On Thu, May 27, 2010 at 12:15:56 +0000, Petr Ročkai wrote:
> Thu May 27 14:10:19 CEST 2010  Petr Rockai <m...@mornfall.net>
>   * Resolve issue1763: use correct filename encoding in conflictors.

OK, we've had two people (Reinier, Eric) look at this and OK it, so I
guess it makes sense for me to push it now, with some thoughts about
future work.

Resolve issue1763: use correct filename encoding in conflictors.
----------------------------------------------------------------
> hunk ./src/Darcs/Patch/Real.hs 716
>          blueText "conflictor" <+> showNons i <+> blueText "[]" $$ showNon p
>      showPatch (Conflictor i cs p) =
>          blueText "conflictor" <+> showNons i <+> blueText "[" $$
> -        showPatch cs $$
> +        showPrimFL NewFormat cs $$
>          blueText "]" $$
>          showNon p
>      showPatch (InvConflictor i NilFL p) =
>

I'm still concerned that we're not being systematic enough about really
fixing this (e.g. should we worry about rotcifnoc? showNon? etc.)

[The mental image I have is those old cartoons where you have the
character on a boat and a leak forms, so he plugs it with a finger, and
then another leak, and another finger, and another leak...]

I also notice this:

    instance ReadPatch Prim where
     readPatch' _ = readPrim OldFormat

    -- this and other darcs-2 format patches use readPrim NewFormat
    readNons :: (ReadPatch p, ParserM m) => m [Non p C(x)]
    readNons = peekfor "{{" rns (return [])
      where rns = peekfor "}}" (return []) $
                  do Just (Sealed ps) <- readPatch' False
                     lexChar ':'
                     Just (Sealed p) <- readPrim NewFormat
                     (Non ps p :) `liftM` rns

and in the read code for Non and RealPatch (I think these are darcs-2
style patches), readPatch eventually uses readPrim NewFormat.

So that makes sense: the double-encoding comes from reading UTF-8 bytes
as code-points [this is where Petr's assertion that "the filepath *is
never decoded*" makes sense], and then trying to encode those
code-points into UTF-8 bytes.

Plan for future work? (Prim FileNameFormat)
-------------------------------------------
How does this plan sound: introduce two new wrapper types, OldFormatPrim
and NewFormatPrim, whose read/show instances use OldFormat/NewFormat
respectively, thus ensuring that readPatch and showPatch automagically
do the right thing? (Or even one parametrisable type, although I imagine
that involves turning on some extension for instances.)

Plan for future work? (two kinds of read/show)
----------------------------------------------
Complementary plan: we should distinguish between decoding/encoding
filepaths from the operating system, and decoding/encoding filepaths to
patch files and patch bundles. Basically the picture looks like this:

  OS <--> darcs <--> patch files

The reason why I initially thought that NewFormat was a step backwards
was that I was thinking about the darcs <--> patch files part. IMHO,
what you want is for darcs <--> patch files to always use UTF-8. On the
other hand, the OS <--> darcs part needs some more thought.

This is a little half-baked right now, but maybe somebody else can run
with the idea?

-- 
Eric Kow <http://www.nltg.brighton.ac.uk/home/Eric.Kow>
PGP Key ID: 08AC04F9
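
P.S. A minimal sketch of the double-encoding described above, in case it
helps to see it in isolation. None of these names (encodeUtf8,
bytesAsCodePoints, doubleEncode) come from darcs; they are hypothetical
helpers written only to show the effect of treating already-encoded
UTF-8 bytes as code-points and then encoding them again. It assumes
nothing beyond base.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (chr, ord)
import Data.Word (Word8)

-- Encode a single code-point as UTF-8 bytes (illustrative helper,
-- not a darcs function).
encodeChar :: Char -> [Word8]
encodeChar c
  | n < 0x80    = [fromIntegral n]
  | n < 0x800   = [0xC0 .|. hi 6, cont 0]
  | n < 0x10000 = [0xE0 .|. hi 12, cont 6, cont 0]
  | otherwise   = [0xF0 .|. hi 18, cont 12, cont 6, cont 0]
  where
    n      = ord c
    -- leading bits of the code-point, for the first byte
    hi s   = fromIntegral (n `shiftR` s)
    -- a continuation byte carrying six bits of the code-point
    cont s = 0x80 .|. fromIntegral ((n `shiftR` s) .&. 0x3F)

-- Encode a whole string as UTF-8.
encodeUtf8 :: String -> [Word8]
encodeUtf8 = concatMap encodeChar

-- The mistake: read each byte as if it were a code-point of its own,
-- i.e. the bytes are never actually decoded.
bytesAsCodePoints :: [Word8] -> String
bytesAsCodePoints = map (chr . fromIntegral)

-- Encode, misread the bytes as code-points, then encode again.
doubleEncode :: String -> [Word8]
doubleEncode = encodeUtf8 . bytesAsCodePoints . encodeUtf8
```

For the "č" in Ročkai (U+010D), encodeUtf8 gives the two bytes C4 8D,
whereas doubleEncode gives the four bytes C3 84 C2 8D, which is the kind
of mangled filename one would expect from this bug.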
_______________________________________________
darcs-users mailing list
darcs-users@darcs.net
http://lists.osuosl.org/mailman/listinfo/darcs-users