Avoid unpack!
ndmitchell:

> Hi Gwern,
>
> I get String/Data.Binary issues too. My suggestion would be to change
> your strings to ByteString's, serialise, and then do the reverse
> conversion when reading. Interestingly, a String and a ByteString have
> identical Data.Binary reps, but in my experiments converting,
> including the cost of BS.unpack, makes the reading substantially
> cheaper.
>
> Thanks
>
> Neil
>
> On Thu, Mar 5, 2009 at 2:33 AM, Gwern Branwen <gwe...@gmail.com> wrote:
> > On Tue, Mar 3, 2009 at 11:50 PM, Spencer Janssen
> > <spencerjans...@gmail.com> wrote:
> >> On Tue, Mar 3, 2009 at 10:30 PM, Gwern Branwen <gwe...@gmail.com> wrote:
> >>> So recently I've been having issues with Data.Binary & Data.Sequence;
> >>> I serialize a 'Seq String'
> >>>
> >>> You can see the file here: http://code.haskell.org/yi/Yi/IReader.hs
> >>>
> >>> The relevant function seems to be:
> >>>
> >>> -- | Read in database from 'dbLocation' and then parse it into an 'ArticleDB'.
> >>> readDB :: YiM ArticleDB
> >>> readDB = io $ (dbLocation >>= r) `catch` (\_ -> return empty)
> >>>     where r x = fmap (decode . BL.fromChunks . return) $ B.readFile x
> >>>     -- We read in with strict bytestrings to guarantee the file is closed,
> >>>     -- and then we convert it to the lazy bytestring data.binary expects.
> >>>     -- This is inefficient, but alas...
> >>>
> >>> My current serialized file is about 9.4M. I originally thought that
> >>> the issue might be the recent upgrade in Yi to binary 0.5, but I
> >>> unpulled patches back to past that, and the problem still manifested.
> >>>
> >>> Whenever yi tries to read the articles.db file, it stack overflows. It
> >>> actually stack-overflowed on even smaller files, but I managed to bump
> >>> the size upwards, it seems, by the strict-ByteString trick.
> >>> Unfortunately, my personal file has since passed whatever that limit
> >>> was.
> >>>
> >>> I've read the previous threads on Data.Binary and Data.Map stack
> >>> overflows carefully, but none of them seem to help; hacking some $!s
> >>> or seqs into readDB seems to make no difference, and Seq is supposed
> >>> to be a strict data structure already! Doing things in GHCi has been
> >>> tedious and hasn't enlightened me much: sometimes things overflow and
> >>> sometimes they don't. It's all very frustrating, and I'm seriously
> >>> considering going back to the original read/show code unless anyone
> >>> knows how to fix this - that approach may be many times slower, but I
> >>> know it will work.
> >>>
> >>> --
> >>> gwern
> >>
> >> Have you tried the darcs version of binary? It has a new instance
> >> which looks more efficient than the old.
> >>
> >> Cheers,
> >> Spencer Janssen
> >
> > I have. It still stack-overflows on my 9.8 meg file. (The magic number
> > seems to be somewhere between 9 and 10 megabytes.)
> >
> > --
> > gwern
> > _______________________________________________
> > Haskell-Cafe mailing list
> > Haskell-Cafe@haskell.org
> > http://www.haskell.org/mailman/listinfo/haskell-cafe
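Neil's suggestion above can be sketched as follows. This is a minimal, self-contained illustration, not Yi's actual code: the `writeDB'`/`readDB'` names are made up for the example, and `Data.ByteString.Char8` pack/unpack is only safe for 8-bit text (it truncates characters above '\255'; a UTF-8 encoder would be needed for full Unicode).

```haskell
-- Sketch of serialising a 'Seq ByteString' instead of a 'Seq String',
-- converting at the boundaries as Neil suggests. Names are illustrative.
import Data.Binary (encode, decode)
import Data.Sequence (Seq)
import qualified Data.ByteString.Char8 as B
import qualified Data.ByteString.Lazy as BL

-- Pack each String to a strict ByteString before encoding.
writeDB' :: FilePath -> Seq String -> IO ()
writeDB' path = BL.writeFile path . encode . fmap B.pack

-- Read strictly (so the handle is closed promptly, as in Yi's readDB),
-- convert to the lazy ByteString that decode expects, then unpack.
readDB' :: FilePath -> IO (Seq String)
readDB' path = do
    s <- B.readFile path
    return (fmap B.unpack (decode (BL.fromChunks [s])))
```

Since String and ByteString have identical Data.Binary representations (per Neil's observation), a file written one way can be read back the other; the conversion only changes which instance does the work at read time.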