On Thu, May 27, 2010 at 12:15:56 +0000, Petr Ročkai wrote:
> Thu May 27 14:10:19 CEST 2010  Petr Rockai <m...@mornfall.net>
>   * Resolve issue1763: use correct filename encoding in conflictors.

OK, we've had two people (Reinier, Eric) look at this and OK it, so I guess it
makes sense for me to push it now with some thoughts about future work.

Resolve issue1763: use correct filename encoding in conflictors.
----------------------------------------------------------------
> hunk ./src/Darcs/Patch/Real.hs 716
>          blueText "conflictor" <+> showNons i <+> blueText "[]" $$ showNon p
>      showPatch (Conflictor i cs p) =
>          blueText "conflictor" <+> showNons i <+> blueText "[" $$
> -        showPatch cs $$
> +        showPrimFL NewFormat cs $$
>          blueText "]" $$
>          showNon p
>      showPatch (InvConflictor i NilFL p) =
> 

I'm still concerned that we're not being systematic enough about really
fixing this (e.g. should we worry about rotcifnoc? showNon? etc.)

[The mental image I have is those old cartoons where you have the
 character on a boat and a leak forms, so he plugs it with a finger,
 and then another leak, and another finger, and another leak...]

I also notice this:

  instance ReadPatch Prim where
    readPatch' _ = readPrim OldFormat

  -- this and other darcs-2 format patches use readPrim NewFormat
  readNons :: (ReadPatch p, ParserM m) => m [Non p C(x)]
  readNons = peekfor "{{" rns (return [])
      where rns = peekfor "}}" (return []) $
                  do Just (Sealed ps) <- readPatch' False
                     lexChar ':'
                     Just (Sealed p) <- readPrim NewFormat
                     (Non ps p :) `liftM` rns
     

and in the read code for Non and RealPatch (I think these are darcs-2 style
patches), readPatch eventually uses readPrim NewFormat.  So that makes sense:
the double-encoding comes from reading UTF-8 bytes as code points [this is
where Petr's assertion that "the filepath *is never decoded*" makes sense],
and then encoding those code points into UTF-8 bytes a second time.
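
To make the double-encoding concrete, here is a minimal sketch in plain
Haskell (base only).  None of these names come from darcs; encodeUtf8,
bytesAsCodePoints and doubleEncode are made up for illustration, and the
encoder does no validation.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (chr, ord)
import Data.Word (Word8)

-- Encode a single code point as UTF-8 bytes (sketch: no surrogate checks).
encodeChar :: Char -> [Word8]
encodeChar c
  | n < 0x80    = [fromIntegral n]
  | n < 0x800   = [0xC0 .|. hi 6, cont 0]
  | n < 0x10000 = [0xE0 .|. hi 12, cont 6, cont 0]
  | otherwise   = [0xF0 .|. hi 18, cont 12, cont 6, cont 0]
  where
    n      = ord c
    -- leading bits that go into the first byte
    hi s   = fromIntegral (n `shiftR` s)
    -- a continuation byte carrying six payload bits
    cont s = 0x80 .|. fromIntegral ((n `shiftR` s) .&. 0x3F)

encodeUtf8 :: String -> [Word8]
encodeUtf8 = concatMap encodeChar

-- The suspected mistake: treat each raw byte as if it were a code point,
-- i.e. never actually decode the UTF-8.
bytesAsCodePoints :: [Word8] -> String
bytesAsCodePoints = map (chr . fromIntegral)

-- Read UTF-8 bytes without decoding, then encode the result again.
doubleEncode :: String -> [Word8]
doubleEncode = encodeUtf8 . bytesAsCodePoints . encodeUtf8
```

For a file name containing U+00E9, encodeUtf8 gives the two bytes C3 A9,
while doubleEncode gives the four bytes C3 83 C2 A9, which is the familiar
mojibake pattern.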

Plan for future work? (Prim FileNameFormat)
-------------------------------------------
How does this plan sound: introduce two new wrapper types OldFormatPrim and
NewFormatPrim whose read/show instances use OldFormat/NewFormat
respectively, thus ensuring that readPatch and showPatch automagically
do the right thing?

(or even one parametrisable type (although I imagine that involves turning
on some extension for instances))
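
A rough sketch of what I mean, using heavily simplified stand-ins.  The
Prim type, the ShowPatch class and showPrim below are toy versions, not the
real darcs definitions; KnownFormat, Old, New and FormattedPrim are names I
made up.  Only the show side is sketched; the read side would follow the
same shape.

```haskell
-- Hypothetical, simplified stand-ins; NOT the real darcs definitions.
data FileNameFormat = OldFormat | NewFormat

newtype Prim = AddFile FilePath

-- Placeholder printer: it only tags the name so the chosen format is visible.
showPrim :: FileNameFormat -> Prim -> String
showPrim fmt (AddFile f) = "addfile " ++ tag fmt ++ f
  where
    tag OldFormat = "old:"
    tag NewFormat = "new:"

class ShowPatch p where
  showPatch :: p -> String

-- Option 1: two wrapper types, each pinned to one format.
newtype OldFormatPrim = OldFormatPrim Prim
newtype NewFormatPrim = NewFormatPrim Prim

instance ShowPatch OldFormatPrim where
  showPatch (OldFormatPrim p) = showPrim OldFormat p

instance ShowPatch NewFormatPrim where
  showPatch (NewFormatPrim p) = showPrim NewFormat p

-- Option 2: one wrapper with a phantom parameter naming the format.
data Old
data New

class KnownFormat fmt where
  formatOf :: proxy fmt -> FileNameFormat

instance KnownFormat Old where formatOf _ = OldFormat
instance KnownFormat New where formatOf _ = NewFormat

newtype FormattedPrim fmt = FormattedPrim Prim

instance KnownFormat fmt => ShowPatch (FormattedPrim fmt) where
  -- the wrapper value itself serves as the proxy for its format
  showPatch p@(FormattedPrim prim) = showPrim (formatOf p) prim
```

In this toy setting the phantom-type variant happens to need no extensions;
whether that carries over to the real classes is something I have not
checked.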

Plan for future work? (two kinds of read/show)
----------------------------------------------
Complementary plan: we should distinguish between decoding/encoding
filepaths from the operating system, and decoding/encoding filepaths
to patch files and patch bundles.

Basically the picture looks like this:

    OS <--> darcs <---> patch files

The reason why I initially thought that NewFormat was a step backwards
was that I was thinking about the darcs <--> patch files part.  IMHO,
what you want is for darcs <--> patch files to always use UTF-8.  On the
other hand, the OS <--> darcs part needs some more thought.
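
One way to picture the two boundaries is to give each side its own type, so
the compiler keeps them apart.  This is only a sketch under my own
assumptions: OsPath, InternalPath, PatchPath and OsEncoding are hypothetical
names, the choice of Latin-1 versus UTF-8 on the OS side is just an example,
and the decoder does not reject overlong sequences or surrogates.

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Char (chr, ord)
import Data.Word (Word8)

newtype OsPath       = OsPath [Word8]      deriving (Eq, Show) -- bytes from the OS
newtype InternalPath = InternalPath String deriving (Eq, Show) -- code points in darcs
newtype PatchPath    = PatchPath [Word8]   deriving (Eq, Show) -- bytes in a patch file

-- Hypothetical: the encoding the OS side happens to use.
data OsEncoding = OsLatin1 | OsUtf8

encodeUtf8 :: String -> [Word8]
encodeUtf8 = concatMap enc
  where
    enc c
      | n < 0x80    = [fromIntegral n]
      | n < 0x800   = [0xC0 .|. hi 6, cont 0]
      | n < 0x10000 = [0xE0 .|. hi 12, cont 6, cont 0]
      | otherwise   = [0xF0 .|. hi 18, cont 12, cont 6, cont 0]
      where
        n      = ord c
        hi s   = fromIntegral (n `shiftR` s)
        cont s = 0x80 .|. fromIntegral ((n `shiftR` s) .&. 0x3F)

-- Lenient decoder: checks structure only, Nothing on malformed input.
decodeUtf8 :: [Word8] -> Maybe String
decodeUtf8 [] = Just []
decodeUtf8 (b:bs)
  | b < 0x80           = (chr (fromIntegral b) :) <$> decodeUtf8 bs
  | b .&. 0xE0 == 0xC0 = multi 1 (fromIntegral (b .&. 0x1F))
  | b .&. 0xF0 == 0xE0 = multi 2 (fromIntegral (b .&. 0x0F))
  | b .&. 0xF8 == 0xF0 = multi 3 (fromIntegral (b .&. 0x07))
  | otherwise          = Nothing
  where
    multi :: Int -> Int -> Maybe String
    multi n acc
      | length cs == n && all isCont cs && code <= 0x10FFFF
                  = (chr code :) <$> decodeUtf8 rest
      | otherwise = Nothing
      where
        (cs, rest) = splitAt n bs
        code = foldl (\a c -> a `shiftL` 6 .|. fromIntegral (c .&. 0x3F)) acc cs
    isCont c = c .&. 0xC0 == 0x80

-- OS <--> darcs: depends on the environment.
fromOs :: OsEncoding -> OsPath -> Maybe InternalPath
fromOs OsLatin1 (OsPath bs) = Just (InternalPath (map (chr . fromIntegral) bs))
fromOs OsUtf8   (OsPath bs) = InternalPath <$> decodeUtf8 bs

-- darcs <--> patch files: always UTF-8, whatever the OS does.
toPatchFile :: InternalPath -> PatchPath
toPatchFile (InternalPath s) = PatchPath (encodeUtf8 s)

fromPatchFile :: PatchPath -> Maybe InternalPath
fromPatchFile (PatchPath bs) = InternalPath <$> decodeUtf8 bs
```

With this split, a Latin-1 name (byte E9) and a UTF-8 name (bytes C3 A9)
coming from the OS both end up as the same bytes C3 A9 in the patch file.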

This is a little half-baked right now, but maybe somebody else can run with
the idea?

-- 
Eric Kow <http://www.nltg.brighton.ac.uk/home/Eric.Kow>
PGP Key ID: 08AC04F9

_______________________________________________
darcs-users mailing list
darcs-users@darcs.net
http://lists.osuosl.org/mailman/listinfo/darcs-users
