Re: [Haskell-cafe] ANNOUNCE: text 0.8.0.0, fast Unicode text support

2010-09-08 Thread Daniel Fischer
On Wednesday 08 September 2010 06:44:32, Bryan O'Sullivan wrote:
 I have a Replace.hs benchmark in the main text repo, just to be sure
 we're talking about the same thing.

Grabbed the darcs repo, compiled with the darcs version and also with the 
installed text package, both exhibit the same behaviour - what I reported 
before.

 Factoring out the time spent on I/O,
 with GHC HEAD, my replace code takes twice the space and time of that in
 the stringsearch package.

That's very good. Twice the space is a given for mostly ASCII text, twice 
the time is okay, I think.
My timings are quite different, but that's probably because 6.12.3's 
inliner doesn't give the full fusion benefit, so it'll improve 
automatically with the next GHC release.

 Given that the space involved is just 121KB
 maximum residency while processing a 124MB file, I'm not concerned about
 it.

I wouldn't either.

 And the time required isn't a bad place to start from, I think.

 By the way, as this implies, I can't reproduce your space behaviour at
 all.


That's surprising.
Have you made sure to replace a pattern which does not occur in the text?
Can you reproduce the behaviour with a) Data.List.intersperse instead of 
the lazier version now used, b) ghc-6.12.* instead of HEAD?

Anyway, I would've thought that with

split pat src
| null pat= emptyError split
| isSingleton pat = splitBy (== head pat) src
| otherwise   = go 0 (indices pat src) src
  where
go  _ [] cs = [cs]
go !i (x:xs) cs = let h :*: t = splitAtWord (x-i) cs
  in  h : go (x+l) xs (dropWords l t)
l = foldlChunks (\a (T.Text _ _ b) - a + fromIntegral b) 0 pat

you can't start returning chunks before it's known whether the list of 
indices is empty, so split would have O(index of pattern) space behaviour.

If HEAD manages to make the chunks available before they are complete 
(before it's known how long they will be), it's even awesomer than I'd have 
dared to hope.
Okay, so I'll have to try HEAD.

  I can now say more. Looking at Data.Text.Lazy.replace,
 
  replace s d = intercalate d . split s
 
  , I also got a space leak with that for BS.Lazy's intercalate and
 
  stringsearch's split.

 How did you observe the space leak? Looking at -hT charted with hp2ps
 shows me nothing, and the program executes in constant space regardless
 of input size.

top and +RTS -s. I've attached the -hT graph of a run on a 74.3MB file with 
a pattern that does not occur. It shows exactly the behaviour I expected 
from the code of split, pretty constant growth of the heap until about 
twice the file size is reached, then fast and pretty constant shrinking as 
the text is output. The graphs are much more interesting if you do 
replacements of patterns without large gaps between occurrences.



nbenchText.ps
Description: PostScript document
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: text 0.8.0.0, fast Unicode text support

2010-09-08 Thread Daniel Fischer
On Wednesday 08 September 2010 13:46:13, Daniel Fischer wrote:
 My timings are quite different, but that's probably because 6.12.3's
 inliner doesn't give the full fusion benefit, so it'll improve
 automatically with the next GHC release.


Or maybe not so much. Just built the latest source bundle from the HEAD 
branch and

6.12.3:
./nbench lazyText bigfile krkx rabi +RTS -s
   1,796,245,884 bytes allocated in the heap
   1,125,596 bytes copied during GC
 110,398,048 bytes maximum residency (8 sample(s))
  38,897,164 bytes maximum slop
 191 MB total memory in use (4 MB lost due to fragmentation)

  Generation 0:  3043 collections, 0 parallel,  3.06s,  3.17s elapsed
  Generation 1: 8 collections, 0 parallel,  0.00s,  0.01s elapsed

  INIT  time0.00s  (  0.00s elapsed)
  MUT   time6.03s  (  6.48s elapsed)
  GCtime3.07s  (  3.18s elapsed)
  EXIT  time0.00s  (  0.00s elapsed)
  Total time9.10s  (  9.66s elapsed)

  %GC time  33.7%  (33.0% elapsed)

  Alloc rate297,965,335 bytes per MUT second

  Productivity  66.3% of total user, 62.4% of total elapsed

6.13.20100831:
./hdbench lazyText bigfile krkx rabi +RTS -s

 543,409,296 bytes allocated in the heap

 699,364 bytes copied during GC 

 110,956,008 bytes maximum residency (8 sample(s))  

  38,893,040 bytes maximum slop 

 191 MB total memory in use (4 MB lost due to fragmentation)


  Generation 0:   652 collections, 0 parallel,  0.44s,  0.43s elapsed
  Generation 1: 8 collections, 0 parallel,  0.00s,  0.01s elapsed

  INIT  time0.00s  (  0.00s elapsed)
  MUT   time5.42s  (  5.77s elapsed)
  GCtime0.44s  (  0.44s elapsed)
  EXIT  time0.00s  (  0.00s elapsed)
  Total time5.86s  (  6.21s elapsed)

  %GC time   7.5%  (7.1% elapsed)

  Alloc rate100,327,729 bytes per MUT second

  Productivity  92.5% of total user, 87.2% of total elapsed


Sure, that's a significant improvement, but that's mostly the GC time, with 
-A64M, 6.12.3 is much closer.

However, for ByteStrings, performance got worse:

6.12.3:
./nbench lazyBS bigfile krkx rabi +RTS -s   
  
  90,127,112 bytes allocated in the heap
  
  31,116 bytes copied during GC 
  
 103,396 bytes maximum residency (1 sample(s))  
  
  39,964 bytes maximum slop 
  
   2 MB total memory in use (0 MB lost due to fragmentation)
  

  Generation 0:   158 collections, 0 parallel,  0.00s,  0.00s elapsed
  Generation 1: 1 collections, 0 parallel,  0.00s,  0.00s elapsed

  INIT  time0.00s  (  0.00s elapsed)
  MUT   time0.10s  (  0.20s elapsed)
  GCtime0.00s  (  0.00s elapsed)
  EXIT  time0.00s  (  0.00s elapsed)
  Total time0.11s  (  0.20s elapsed)

  %GC time   3.6%  (1.8% elapsed)

  Alloc rate834,456,211 bytes per MUT second

  Productivity  92.9% of total user, 50.9% of total elapsed

6.13.20100831:
./hdbench lazyBS bigfile krkx rabi +RTS -s  

 478,710,672 bytes allocated in the heap

 164,904 bytes copied during GC 

  86,992 bytes maximum residency (1 sample(s))  

  44,080 bytes maximum slop 

   2 MB total memory in use (0 MB lost due to fragmentation)


  Generation 0:   864 collections, 0 parallel,  0.00s,  0.01s elapsed
  Generation 1: 1 collections, 0 parallel,  0.00s,  0.00s elapsed

  INIT  time0.00s  (  0.00s elapsed)
  MUT   time0.17s  (  0.28s elapsed)
  GCtime0.00s  (  0.01s elapsed)
  EXIT  time0.00s  (  0.00s elapsed)
  Total time0.18s  (  0.29s elapsed)

  %GC time   2.3%  (4.1% elapsed)

  Alloc rate2,783,039,776 bytes per MUT second

  Productivity  95.5% of total user, 57.3% of total elapsed


Not only got it slower, it also allocates more than five times as much as 
before.

  Given that the space involved is just 121KB
  maximum residency while processing a 124MB file, I'm not concerned
  about it.

 I wouldn't either.


But it needs more space here, so I am concerned.

  And the time required isn't a bad place to start from, I think.
 
  By the way, as this implies, I can't reproduce your space behaviour at
  all.

 That's surprising.
 Have you made sure to replace a pattern which does not occur in the
 text? Can you 

Re: [Haskell-cafe] ANNOUNCE: text 0.8.0.0, fast Unicode text support

2010-09-07 Thread Bryan O'Sullivan
On Wed, Sep 1, 2010 at 12:29 PM, Daniel Fischer daniel.is.fisc...@web.dewrote:

 I'm on Linux. I guess that's another point in favour of it:)
 Do you happen to know why it's slower on a Mac?


I'd guess because of something to do with the system iconv.


 So I tentatively believe most of the difference is spent doing the
 replacements.


I have a Replace.hs benchmark in the main text repo, just to be sure we're
talking about the same thing. Factoring out the time spent on I/O, with GHC
HEAD, my replace code takes twice the space and time of that in the
stringsearch package. Given that the space involved is just 121KB maximum
residency while processing a 124MB file, I'm not concerned about it. And the
time required isn't a bad place to start from, I think.

By the way, as this implies, I can't reproduce your space behaviour at all.

I can now say more. Looking at Data.Text.Lazy.replace,

 replace s d = intercalate d . split s

, I also got a space leak with that for BS.Lazy's intercalate and
 stringsearch's split.


How did you observe the space leak? Looking at -hT charted with hp2ps shows
me nothing, and the program executes in constant space regardless of input
size.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: text 0.8.0.0, fast Unicode text support

2010-09-01 Thread Ivan Lazar Miljenovic
On 1 September 2010 15:14, John Millikin jmilli...@gmail.com wrote:
 [aside] does anybody know how to get a list of what packages
 somebody's uploaded to Hackage? I think I've updated all mine for the
 new text version dependency, but I'm worried I forgot some.

Unfortunately, no; I've often wished the same thing (not for my
packages, but because I could have sworn someone uploaded a package
that did something, I remembered the maintainer but not the name of
the package...).

-- 
Ivan Lazar Miljenovic
ivan.miljeno...@gmail.com
IvanMiljenovic.wordpress.com
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: text 0.8.0.0, fast Unicode text support

2010-09-01 Thread Tako Schotanus
On Wed, Sep 1, 2010 at 07:14, John Millikin jmilli...@gmail.com wrote:


  Don't forget, you can always improve the text library yourself. I love to
 receive
  patches, requests for improvement, and bug reports.

 Are there any areas in particular you'd like help with, for either
 library? I'm happy to assist any effort which will help reduce use of
 String.


As a Haskell noob I'm curious about this statement, is there something
intrinsically wrong with String?
Or is it more a performance/resource problem when dealing with large amounts
of text for example?
(Like having to use StringBuilder in Java if you want to avoid the penalty
of repeated String allocations when simply concatenating for example)

Cheers,
 -Tako
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: text 0.8.0.0, fast Unicode text support

2010-09-01 Thread David Virebayre
2010/9/1 Tako Schotanus t...@codejive.org:
 As a Haskell noob I'm curious about this statement, is there something
 intrinsically wrong with String?

String is just a linked list of Char which are unicode code points;
which is probably not the optimal way to store text.
For intensive use of text it takes too much memory, and/or it's not
fast enough.

Sometimes I'd love if I could program using String and the compiler
would automatically convert that to Text, or Bytestrings, but of
course it's wishful thinking.

David.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: text 0.8.0.0, fast Unicode text support

2010-09-01 Thread Daniel Fischer
On Wednesday 01 September 2010 06:08:07, Bryan O'Sullivan wrote:
 New in this release:

- Substantial performance improvements.
- Bug fixes, and better quality assurance via automated QuickCheck
 and HPC tests and coverage.
- An even nicer API than before.

Good job.


 The text library provides an efficient packed, immutable Unicode text
 type (both strict and lazy), with a powerful loop fusion optimization
 framework.

 The 'Text' type represents Unicode character strings, in a time
 and space-efficient manner. This package provides text
 processing capabilities that are optimized for performance critical use,
 both in terms of large data quantities and high speed.

Hmm, I have to throw a few grains of salt into that.

File: big.txt from Peter Norvig's spelling checker (6.2MiB)

1. read the file and output it to stdout (redirected to /dev/null)
ByteString.Lazy: = 0.01s
text-0.7.2.1 (Lazy): 0.68s
text-0.8.0.0 (Lazy): 0.26s
String: 0.35s

ByteString doesn't do any de/encoding or validity checking, so it's clear 
that can be way faster than I/O which does.
The interesting part is the comparison between text and vanilla String I/O, 
the difference is smaller than I expected for text-0.8.0.0. It came as a 
surprise that text-0.7.2.1 actually had much worse performance than vanilla 
Strings.

Okay, now we know how long the I/O takes, let's measure some work.

2. read the file, replace some string, output to stdout (redirected to 
/dev/null)
a) Professor - Teacher
stringsearch (Lazy): 0.02s, 2MB total
text-0.7.2.1 (Lazy): 0.92s, 28MB total
text-0.7.2.1 (Strict): 0.90s, 65MB total
text-0.8.0.0 (Lazy): 0.44s, 12MB total
text-0.8.0.0 (Strict): 0.36S, 55MB total
String (naive): 0.51s, 1MB total
String (KMP): 0.46s, 1MB total

b) deteriorate - worsen
stringsearch (Lazy): 0.02s, 2MB total
text-0.7.2.1 (Lazy): 1.01s, 39MB total
text-0.7.2.1 (Strict): 0.91s, 65MB total
text-0.8.0.0 (Lazy): 0.45s, 17MB total
text-0.8.0.0 (Strict): 0.36s, 55MB total
String (naive): 0.52s, 1MB total
String (KMP): 0.50s, 1MB total

c) hen - here
stringsearch (Lazy): 0.04s, 2MB total
text-0.7.2.1 (Lazy): 1.10s, 2MB total
text-0.7.2.1 (Strict): 0.95s, 65MB total
text-0.8.0.0 (Lazy): 0.66s, 2MB total
text-0.8.0.0 (Strict): 0.41s, 55MB total
String (naive): 0.53s, 1MB total
String (KMP): 0.51s, 1MB total

d) kre - here
stringsearch (Lazy): 0.03s, 2MB total
text-0.7.2.1 (Lazy): 1.23s, 39MB total
text-0.7.2.1 (Strict): 0.96s, 65MB total
text-0.8.0.0 (Lazy): 0.65s, 17MB total
text-0.8.0.0 (Strict): 0.40s, 55MB total
String (naive): 0.51s, 1MB total
String (KMP): 0.47s, 1MB total

The performance of text-0.8.0.0 has improved significantly over that of 
text-0.7.2.1 (for the tested features), the improvement of the replacing 
algorithm is however not as impressive as that of I/O.

Replacing isn't much faster than for Strings, however, for (some?) very 
short patterns, even slower.
That's not so good.

What's *really* bad is the space behaviour.

My tests suggest that the space usage (for Text.Lazy) depends on the index 
of the pattern to replace, the later it appears (if at all), the more 
memory is used by text. Apparently it doesn't deliver the beginning before 
a match has been found (or the whole string has been scanned).
That space leak ought to be fixed.

The space behaviour of the strict variant doesn't show that problem, it 
seems to only depend on the file, but it uses disproportionally much 
memory.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: text 0.8.0.0, fast Unicode text support

2010-09-01 Thread Bryan O'Sullivan
Hi, Daniel -

Thanks for taking the new code for a test drive!


 The interesting part is the comparison between text and vanilla String I/O,
 the difference is smaller than I expected for text-0.8.0.0.


Yes. Much of this is due to the new encoding stuff on Handles in GHC 6.12,
which is slow. Its performance wasn't so noticeable when it was only
shipping String around, but it's much more visible with Text. It's far
slower on a Mac than on Linux, in case that's relevant.


 The performance of text-0.8.0.0 has improved significantly over that of
 text-0.7.2.1 (for the tested features), the improvement of the replacing
 algorithm is however not as impressive as that of I/O.


I'd bet you that's mostly because the program is I/O bound, in the sense
that it's spending time going through the layers of buffering and
translation that now make up a Handle. Any improvement in other code is
going to be hard to see because of that.

The other major consideration, both for this case and the first one you
note, is that the inliner in 6.12 chokes on code that uses stream fusion: it
boxes and unboxes vast quantities of state. That kills performance due to
both the boxing and unboxing overhead and the increased number of nursery
GCs.

The marvelous new 6.13 inliner does a *much* better job here - that's where
I see those 3x performance improvements for free.

What's *really* bad is the space behaviour.


What are you using to measure that?

Also, please don't forget to post your benchmark code when you make
observations like this, as that way I can reproduce your measurements and
fix problems. I appreciate your help!

Regards,
Bryan.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: text 0.8.0.0, fast Unicode text support

2010-09-01 Thread Bryan O'Sullivan
On Tue, Aug 31, 2010 at 10:14 PM, John Millikin jmilli...@gmail.com wrote:

 Is there a summary of the API changes available? I see a new module,
 but Precis is choking on Data.Text and Data.Text.Lazy, so I'm not sure
 what existing signatures have been modified.


Ouch. I'll try to do a diff and follow up, probably not until much later.


 Are there any areas in particular you'd like help with, for either
 library? I'm happy to assist any effort which will help reduce use of
 String.


For the text library, my main interest is performance. I've been far more
focused on making the code correct and complete than on making it fast, but
now that the API is stable and the test coverage is solid, performance
deserves more attention.

What specifics would be useful?

   - Reproducible, realistic (but preferably small) benchmarks
   - Patches that speed things up

Thanks for asking!
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: text 0.8.0.0, fast Unicode text support

2010-09-01 Thread Daniel Fischer
On Wednesday 01 September 2010 18:15:19, Bryan O'Sullivan wrote:
 Hi, Daniel -

 Thanks for taking the new code for a test drive!

  The interesting part is the comparison between text and vanilla String
  I/O, the difference is smaller than I expected for text-0.8.0.0.

 Yes. Much of this is due to the new encoding stuff on Handles in GHC
 6.12, which is slow. Its performance wasn't so noticeable when it was
 only shipping String around, but it's much more visible with Text. It's
 far slower on a Mac than on Linux, in case that's relevant.


I'm on Linux. I guess that's another point in favour of it:)
Do you happen to know why it's slower on a Mac?

  The performance of text-0.8.0.0 has improved significantly over that
  of text-0.7.2.1 (for the tested features), the improvement of the
  replacing algorithm is however not as impressive as that of I/O.

 I'd bet you that's mostly because the program is I/O bound, in the sense
 that it's spending time going through the layers of buffering and
 translation that now make up a Handle. Any improvement in other code is
 going to be hard to see because of that.

Maybe. As a rough measure, I took (total time - time for just reading and 
outputting) for an approximation of the processing (replacing) time.
For the first example (Lazy), that yields
0.7.2.1: 0.92s - 0.68s = 0.24s
0.8.0.0: 0.44s - 0.26s = 0.18s,
for the fourth
0.7.2.1: 1.23s - 0.68s = 0.55s
0.8.0.0: 0.65s - 0.26s = 0.39s

Yes, I know it's very crude, but TIO.readFile file = TIO.putStrLn does no 
less translation than with a replacing in between, or does it?
So I tentatively believe most of the difference is spent doing the 
replacements.


 The other major consideration, both for this case and the first one you
 note, is that the inliner in 6.12 chokes on code that uses stream
 fusion: it boxes and unboxes vast quantities of state. That kills
 performance due to both the boxing and unboxing overhead and the
 increased number of nursery GCs.

I haven't looked at the core, so I take your word for it. I know the 
behaviour from other situations.


 The marvelous new 6.13 inliner does a *much* better job here - that's
 where I see those 3x performance improvements for free.

That's nice. Maybe I should go get the HEAD before 6.14.1 will be released.


 What's *really* bad is the space behaviour.


 What are you using to measure that?

+RTS -s and top. The figures I gave were the total memory in use of
+RTS -s, maximum residency was 20MB (vs. 39 total) for 0.7 and 12MB (vs. 17 
total) for 0.8 in the bad tests.


 Also, please don't forget to post your benchmark code when you make
 observations like this,

It's not very interesting,

{-# LANGUAGE BangPatterns #-}
module Main (main) where

import System.Environment (getArgs)
import qualified Data.Text.Lazy as T
import qualified Data.Text.Lazy.IO as TIO

main :: IO ()
main = do
(file : pat : sub : _) - getArgs
let !spat = T.pack pat
!ssub = T.pack sub
work = T.replace spat ssub
TIO.readFile file = TIO.putStrLn . work

with the obvious changes of imports for the strict Text and the almost 
obvious changes for the ByteString code.

 as that way I can reproduce your measurements
 and fix problems. I appreciate your help!

I can now say more. Looking at Data.Text.Lazy.replace,

replace s d = intercalate d . split s

, I also got a space leak with that for BS.Lazy's intercalate and 
stringsearch's split. Looking at intercalate,

intercalate t = concat . (Data.List.intersperse t)

, that's where you definitely get a space leak, because

intersperse :: a - [a] - [a]
intersperse _   []  = []
intersperse _   [x] = [x]
intersperse sep (x:xs)  = x : sep : intersperse sep xs

isn't lazy enough.
Given the context, it's not hard to see what's wrong here. Before 
intersperse produces any output, it checks whether the list contains at 
least two elements (if any). If a is a list-like type, like String, lazy 
ByteStrings or lazy Text, where the elements of the list are lazily 
produced one after the other, as is the case for split, each element must 
be complete before it can be delivered and then consumed.
So indeed, replace needs O(index pat) space :(

I don't think that fixing Data.List.intersperse will fix your space leak, 
though.

split pat src
| null pat= emptyError split
| isSingleton pat = splitBy (== head pat) src
| otherwise   = go 0 (indices pat src) src
  where
go  _ [] cs = [cs]
go !i (x:xs) cs = let h :*: t = splitAtWord (x-i) cs
  in  h : go (x+l) xs (dropWords l t)
l = foldlChunks (\a (T.Text _ _ b) - a + fromIntegral b) 0 pat

Using the list of indices of the pattern, split can't deliver anything 
before the first (next) occurrence of the pattern has been located (or the 
end of the string reached). Again, that forces O(index pat) space 
consumption.

I don't see how to avoid that without duplicating a large part of indices' 
logic in a dedicated 

Re: [Haskell-cafe] ANNOUNCE: text 0.8.0.0, fast Unicode text support

2010-09-01 Thread Brandon S Allbery KF8NH

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 09/01/2010 02:44 AM, Ivan Lazar Miljenovic wrote:
 On 1 September 2010 16:27, David Virebayre
 dav.vire+hask...@gmail.com wrote:
 Sometimes I'd love if I could program using String and the
 compiler would automatically convert that to Text, or
 Bytestrings, but of course it's wishful thinking.
 Well, there's OverloadedStrings for String literals...

 But the problem with automagic conversion would be one of encoding
 for Bytestrings at least.

This reminds me:  it would be nice if Text, ByteString, etc. exported
their own definitions of type String.  As most users have to import
them qualified to avoid collisions with Data.List (and the Prelude?)
anyway, this would enable someone to import their desired module as,
say, S, then use type S.String and string ops S.foo.  Anyone who *did*
use them unqualified would have to add (hiding (String)) to their
imports, of course.

The flip side of this would be a module which contained the definition
for String and re-exported Data.List, so that the above S.mumble could
work for String as well.  (The obvious choice would be Data.String,
but that's taken for IsString and of course that would be required to
pull this off so it can't just have the list implementation stuffed
into it.)
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkx/FsQACgkQIn7hlCsL25Vk8ACeKco9/IFwGi+8gc+BxSyT3QwY
oP8An3oabG1lbXHChShbenEWj9HWEq6Q
=PQsO
-END PGP SIGNATURE-

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] ANNOUNCE: text 0.8.0.0, fast Unicode text support

2010-08-31 Thread Bryan O'Sullivan
New in this release:

   - Substantial performance improvements.
   - Bug fixes, and better quality assurance via automated QuickCheck and
   HPC tests and coverage.
   - An even nicer API than before.

The text library provides an efficient packed, immutable Unicode text type
(both strict and lazy), with a powerful loop fusion optimization framework.

The 'Text' type represents Unicode character strings, in a time
and space-efficient manner. This package provides text
processing capabilities that are optimized for performance critical use,
both in terms of large data quantities and high speed.

The text package provides character-encoding, type-safe case conversion via
whole-string case conversion functions. It also provides a range of
functions for converting Text values to and from ByteStrings, using several
standard encodings (see the text-icu package for a much larger variety of
encoding functions). Efficient locale-sensitive support for text I/O is also
supported.

http://www.serpentine.com/blog/2010/09/01/major-version-of-the-haskell-text-library-0-8-0-0/
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: text 0.8.0.0, fast Unicode text support

2010-08-31 Thread John Millikin
Is there a summary of the API changes available? I see a new module,
but Precis is choking on Data.Text and Data.Text.Lazy, so I'm not sure
what existing signatures have been modified.

 Don't forget, you can always improve the text library yourself. I love to 
 receive
 patches, requests for improvement, and bug reports.

Are there any areas in particular you'd like help with, for either
library? I'm happy to assist any effort which will help reduce use of
String.

[aside] does anybody know how to get a list of what packages
somebody's uploaded to Hackage? I think I've updated all mine for the
new text version dependency, but I'm worried I forgot some.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe