Sune, Yuki, and anyone else:
What are your thoughts? So far the consensus seems to be leaning toward
using wstring directly instead of indirectly via #ifdef+typedefs.
--Stefan
On Mon, Nov 30, 2009 at 07:05, Steve Borho <st...@borho.org> wrote:
> On Fri, Nov 27, 2009 at 3:16 PM, Adrian Buehlmann <adr...@cadifra.com>
> wrote:
> >
> >
> > On 27.11.2009 18:44, Stefan Rusek wrote:
> >> Please see
> >> http://bitbucket.org/tortoisehg/stable/issue/672/shell-extension-unicode-support
> >> for more context.
> >>
> >> I am working to add support for unicode filenames to the shell
> >> extension. Out of the box, THG currently uses the non-unicode api
> >> calls in the shell extension. This works because on windows hg uses
> >> the same non-unicode api calls, and because it's built on hg, hgtk
> >> also ends up using the non-unicode api calls. The fixutf8 hg extension
> >> wraps all of the disk io calls with their unicode equivalents in order
> >> to add support for unicode filenames on windows. With the extension
> >> enabled, both hg and hgtk work properly with unicode filenames,
> >> however the shell extension does not.
> >>
> >> The plan is to have the TortoiseHg RPC server pass the value of
> >> mercurial._encoding to tortoisehg via the thgstatus file, so that the
> >> shell extension knows how to read both the thgstatus and dirstate
> >> files.
> >>
> >> The one issue left to sort out is more of an issue of style. Currently
> >> std::string is used throughout the shell extension for storing
> >> filenames and paths. We could switch to std::wstring for all paths, or
> >> we could do an #ifdef UNICODE and make it so that the shell extension
> >> could be compiled in either unicode or non-unicode.
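To make the two options concrete, here is a rough sketch of what each could look like. All names (PathString, PATH_LITERAL, hgDirSwitchable, hgDirWide) are made up for illustration and are not taken from the shell extension source; the Windows SDK's own TCHAR/TEXT() machinery follows the same pattern as the first option.

```cpp
#include <string>

// Option 1 (hypothetical names): choose the path string type at compile time.
#ifdef UNICODE
typedef std::wstring PathString;
#define PATH_LITERAL(s) L##s
#else
typedef std::string PathString;
#define PATH_LITERAL(s) s
#endif

// Every literal and every API call has to go through the macro layer.
PathString hgDirSwitchable(const PathString& root) {
    return root + PATH_LITERAL("\\.hg");
}

// Option 2: use std::wstring for paths unconditionally.
std::wstring hgDirWide(const std::wstring& root) {
    return root + L"\\.hg";
}
```

With option 1 the same source yields two different binaries depending on whether UNICODE is defined; with option 2 there is only one code path to test.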
> >>
> >> I was originally in favor of the #ifdef approach, but there wouldn't
> >> be any advantage to compiling to non-unicode. Windows uses only
> >> unicode under the hood, so when we get filenames from Windows, the
> >> filename gets converted to non-unicode, and then when we call the
> >> non-unicode version of CreateProcess to spawn hgtk, Windows
> >> automatically converts the command line we pass in back to unicode.
> >> Additionally, the #ifdef approach effectively creates two versions
> >> of the shell extension that would need to be supported.
> >>
> >
> > Given my limited understanding of encoding issues, this statement
> > may have a high risk of shooting myself in the foot, but...
> >
> > I would say switch to std::wstring for all file paths and don't
> > use #ifdef UNICODE's.
> >
> > The recoding from what's in .hg/dirstate to std::wstring should then
> > happen when reading the .hg/dirstate into the shell extension's data
> > structures in memory.
> >
> > All file paths in memory should then be assumed to be encoded
> > in whatever Windows' encoding for wide character string filenames
> > is.
> >
> > It looks like this is UTF-16:
> > http://msdn.microsoft.com/en-us/library/dd374081(VS.85).aspx
> >
> > So care must be taken, for example, when splitting a path
> > into its parts (splitting on '\'), which is done when reading
> > the dirstate in the current code.
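A small sketch of such a split on wide strings, with a hypothetical helper name (splitPath) and the separator left as a parameter. One point in favor of doing this on UTF-16 data: the code unit 0x005C can only ever be a backslash there, because surrogates live in the 0xD800-0xDFFF range. In some multibyte code pages, such as Shift-JIS (code page 932), 0x5C can also occur as the second byte of a character, so a byte-wise split on a narrow string can cut a character in half.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: split a wide path into its components.
// Empty components (leading, trailing, or doubled separators) are skipped.
std::vector<std::wstring> splitPath(const std::wstring& path,
                                    wchar_t sep = L'\\') {
    std::vector<std::wstring> parts;
    std::wstring::size_type start = 0;
    while (start <= path.size()) {
        std::wstring::size_type end = path.find(sep, start);
        if (end == std::wstring::npos) end = path.size();
        if (end > start) parts.push_back(path.substr(start, end - start));
        start = end + 1;
    }
    return parts;
}
```

For example, splitPath(L"C:\\repo\\src\\file.txt") gives the four components C:, repo, src and file.txt.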
>
> I don't see a need to use #ifdefs throughout the code either. I'm
> guessing we'll want to bundle the fixutf8 extension as well so it can
> be easily enabled.
>
> --
> Steve Borho
>
_______________________________________________
Tortoisehg-develop mailing list
Tortoisehg-develop@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tortoisehg-develop