On Tue, Nov 10, 2009 at 8:20 PM, Gokul P. Nair <[email protected]> wrote:
> --- On Sat, 11/7/09, Don Stewart <[email protected]> wrote:
>
> > General notes:
> >
> > * unpack is almost always wrong.
> > * list indexing with !! is almost always wrong.
> > * words/lines are often wrong for parsing large files (they build large
> >   list structures).
> > * toList/fromList probably aren't the best strategy
> > * sortBy (comparing snd)
> > * use insertWith'
> >
> > Specifically, avoid constructing intermediate lists, when you can process
> > the entire file in a single pass. Use O(1) bytestring substring operations
> > like take and drop.
>
> Thanks all for the valuable feedback. Switching from Regex.Posix to
> Regex.PCRE alone reduced the running time to about 6 secs and a few other
> optimizations suggested on this thread brought it down to about 5 secs ;)
>
> I then set out to profile the code out of curiosity to see where the bulk
> of the time was being spent and sure enough the culprit turned out to be
> "unpack". My question therefore is, given a list L1 of type [(ByteString,
> Int)], how do I print it out so as to eliminate the "chunk, empty" markers
> associated with a bytestring? The suggestions posted here are along the
> lines of "mapM_ print L1" but that's far from desirable especially because
> the generated output is for perusal by non-technical users etc.
>
> Thanks.

Take a look at Data.ByteString.Lazy.Char8.putStrLn. That prints a lazy
ByteString without unpacking it, and without the internal markers.

Sincerely,
Brad
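
P.S. A minimal sketch of that suggestion, assuming the list holds lazy
ByteStrings. The helper name formatEntry, the tab separator, and the sample
data are made up for illustration and are not from the original program.
Only the Int count goes through String (via show); the ByteString itself is
never unpacked.

```haskell
import qualified Data.ByteString.Lazy.Char8 as L

-- Hypothetical helper: render one (word, count) pair as a single line.
-- The ByteString is reused as-is; only the small Int is converted via show.
formatEntry :: (L.ByteString, Int) -> L.ByteString
formatEntry (w, n) = L.concat [w, L.pack "\t", L.pack (show n)]

main :: IO ()
main = mapM_ (L.putStrLn . formatEntry) sample
  where
    -- Illustrative data standing in for the list L1 from the question.
    sample = [(L.pack "foo", 3), (L.pack "bar", 1)]
```

Because formatEntry builds a ByteString rather than calling show on the
pair, the output has no quotes, tuple punctuation, or chunk markers.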
