Re[2]: [Haskell-cafe] Re: Unicode workaround for getDirectoryContents under Windows?

2009-06-17 Thread Bulat Ziganshin
Hello Simon,

Wednesday, June 17, 2009, 11:55:15 AM, you wrote:

> Right, so getArgs is already fine.

this is what i found in the Jun 15 sources:

#ifdef __GLASGOW_HASKELL__
getArgs :: IO [String]
getArgs =
  alloca $ \ p_argc ->
  alloca $ \ p_argv -> do
   getProgArgv p_argc p_argv
   p    <- fromIntegral `liftM` peek p_argc
   argv <- peek p_argv
   peekArray (p - 1) (advancePtr argv 1) >>= mapM peekCString


foreign import ccall unsafe "getProgArgv"
  getProgArgv :: Ptr CInt -> Ptr (Ptr CString) -> IO ()


it uses peekCString, so it cannot possibly produce unicode chars
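
a minimal sketch of the problem, assuming peekCString maps each byte to exactly one Char (as the base sources of that time did). the names decodeBytewise and example are made up for illustration and are not part of any library:

```haskell
import Data.Char (chr)
import Data.Word (Word8)

-- Hypothetical stand-in for per-byte decoding: one Char per byte,
-- so no resulting Char can ever be above '\255'.
decodeBytewise :: [Word8] -> String
decodeBytewise = map (chr . fromIntegral)

-- Code page 1251 stores the Cyrillic letter U+0416 as the single byte 0xC6.
-- Decoded byte by byte, it comes out as U+00C6 instead.
example :: String
example = decodeBytewise [0xC6]
```

whatever the active code page is, a program that decodes argv this way only ever sees chars in the range 0..255.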


-- 
Best regards,
 Bulat                            mailto:bulat.zigans...@gmail.com

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re[2]: [Haskell-cafe] Re: Unicode workaround for getDirectoryContents under Windows?

2009-06-17 Thread Bulat Ziganshin
Hello Simon,

Wednesday, June 17, 2009, 12:01:11 PM, you wrote:

> foreign import stdcall unsafe "GetFullPathNameW"
>   c_GetFullPathName :: LPCTSTR -> DWORD -> LPTSTR -> Ptr LPTSTR -> IO DWORD

you are right, i was confused by the unused GetFullPathNameA import in
System.Directory:

#if defined(mingw32_HOST_OS)
foreign import stdcall unsafe "GetFullPathNameA"
            c_GetFullPathName :: CString
                              -> CInt
                              -> CString
                              -> Ptr CString
                              -> IO CInt
#else
foreign import ccall unsafe "realpath"
                   c_realpath :: CString
                              -> CString
                              -> IO CString
#endif



> So as you can see, there's not much left to do.  I'll fix openFile.

c_stat is used widely. it may be that half of the System.Directory
functions are broken because they call this function directly or
indirectly





Re[2]: [Haskell-cafe] Re: Unicode workaround for getDirectoryContents under Windows?

2009-06-17 Thread Bulat Ziganshin
Hello Simon,

Wednesday, June 17, 2009, 12:46:49 PM, you wrote:

> I see, so you were previously quoting code from some other source.

from my program

> Where did the GetCommandLineW version come from?  Do you know of any
> issues that would prevent us using it in GHC?

it should be as fine as any other *W call. the only thing is that we
may prefer to include it in the Win32 package like the other routines
and then call it from there




Re[2]: [Haskell-cafe] Re: Unicode workaround for getDirectoryContents under Windows?

2009-06-16 Thread Bulat Ziganshin
Hello Simon,

Tuesday, June 16, 2009, 4:34:29 PM, you wrote:

> Thanks for reminding me that openFile is also broken.  It's easily
> fixed, so I'll look into that.

i fear that it will leave the GHC libs in an inconsistent state that can
drive users mad. now at least there are some rules to the brokenness. when
some functions are unicode-aware and some use the ansi codepage, and this
may change in every version, this unicode support becomes
completely useless. it will be like the floating Base situation, when it's
impossible to write programs against Base since it's different each time

also, i think that the best way to fix windows compatibility is to
provide something like this:

#if WINDOWS

type CWFilePath   = LPCTSTR   -- filename in C land
type CWFileOffset = Int64     -- filesize or filepos in C land
withCWFilePath = withTString  -- FilePath->CWFilePath conversion
peekCWFilePath = peekTString  -- CWFilePath->FilePath conversion

#else

type CWFilePath   = CString
type CWFileOffset = COff
withCWFilePath = withCString
peekCWFilePath = peekCString

#endif

and then systematically rewrite all string-related OS API calls using
these definitions
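
a rough sketch of how a caller could use such definitions, using only what base's Foreign.C.String exports. this is an assumption-laden illustration: CWString / withCWString / peekCWString stand in for the Win32 package's LPCTSTR / withTString / peekTString, the mingw32_HOST_OS test stands in for WINDOWS, and roundTrip is a made-up helper:

```haskell
{-# LANGUAGE CPP #-}
import Foreign.C.String
  (CString, CWString, withCString, withCWString, peekCString, peekCWString)

#if defined(mingw32_HOST_OS)
-- Windows: filenames cross the FFI boundary as wide (wchar_t) strings.
type CWFilePath = CWString

withCWFilePath :: FilePath -> (CWFilePath -> IO a) -> IO a
withCWFilePath = withCWString

peekCWFilePath :: CWFilePath -> IO FilePath
peekCWFilePath = peekCWString
#else
-- Everywhere else: filenames cross the boundary as char-based C strings.
type CWFilePath = CString

withCWFilePath :: FilePath -> (CWFilePath -> IO a) -> IO a
withCWFilePath = withCString

peekCWFilePath :: CWFilePath -> IO FilePath
peekCWFilePath = peekCString
#endif

-- Hypothetical helper: marshal a FilePath to the C representation and back.
-- A real binding would pass the pointer to an OS call instead of peeking it.
roundTrip :: FilePath -> IO FilePath
roundTrip path = withCWFilePath path peekCWFilePath
```

code written against withCWFilePath / peekCWFilePath does not mention the platform at all, which is the point of the proposal: only this one block of definitions is conditional.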

what is the point of having openFile and getDirContents
unicode-aware if deleteFile and even getFileStat aren't?


i've attached my own internal module that does this job for my own
program - just for reference


Win32Files.hs
Description: Binary data


Re[2]: [Haskell-cafe] Re: Unicode workaround for getDirectoryContents under Windows?

2009-06-16 Thread Bulat Ziganshin
Hello Simon,

Tuesday, June 16, 2009, 5:02:43 PM, you wrote:

> Also currently broken:
>
>    * calling removeFile on a FilePath you get from getDirectoryContents,
>      amongst other System.Directory operations
>
> Fixing getDirectoryContents will fix these.

no. removeFile, like everything else, also uses the ACP-based API

> I don't know how getArgs fits in here - should we be decoding argv using
> the ACP?

well, the whole story: windows internally uses Unicode for handling
strings. externally, it provides 2 API families:

1) the A-family (such as CreateFileA) uses 8-bit char-based strings.
these strings are encoded in the current ANSI code page (ACP), which
depends on the locale. the first 128 chars are common to all codepages
and provide the ASCII char set; the higher 128 chars are
locale-specific. say, for the German locale they provide chars with
umlauts, for the Russian locale - cyrillic chars

2) the W-family (such as CreateFileW) uses UTF-16 encoded 16-bit
wchar-based strings, which are locale-independent


the Windows C runtime emulates the POSIX API (open, opendir, stat and
so on) by translating these (char-based) calls into the A-family. GHC
libs are written the Unix way, so they are effectively bound to the
A-family of the Win API

the C runtime also provides w* variants of the POSIX API (_wopen,
_wopendir, _wstat...) that use UTF-16 encoded 16-bit wchar-based
strings, so for proper handling of Unicode strings (filenames, cmdline
arguments) we should use these APIs
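
to make the "locale-independent" part concrete, here is a small pure sketch of UTF-16 decoding. decodeUtf16 is a made-up name for illustration (not a library function), and it passes unpaired surrogates through unchanged instead of reporting an error:

```haskell
import Data.Char (chr)
import Data.Word (Word16)

-- Decode a list of UTF-16 code units into a String.
-- A high surrogate followed by a low surrogate forms one code point
-- above U+FFFF; every other unit is a code point on its own.
decodeUtf16 :: [Word16] -> String
decodeUtf16 (hi : lo : rest)
  | hi >= 0xD800 && hi <= 0xDBFF && lo >= 0xDC00 && lo <= 0xDFFF
  = chr (0x10000 + (fromIntegral hi - 0xD800) * 0x400
                 + (fromIntegral lo - 0xDC00))
    : decodeUtf16 rest
decodeUtf16 (u : rest) = chr (fromIntegral u) : decodeUtf16 rest
decodeUtf16 []         = []
```

the same sequence of code units always decodes to the same chars, with no code page involved. that is why filenames obtained from the W-family can be round-tripped on any machine, while A-family strings depend on the locale of the machine they were read on.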


my old proposal: http://haskell.org/haskellwiki/Library/IO





Re[2]: [Haskell-cafe] Re: Unicode workaround for getDirectoryContents under Windows?

2009-06-16 Thread Bulat Ziganshin
Hello Simon,

Tuesday, June 16, 2009, 7:30:55 PM, you wrote:

> Actually we use a mixture of CRT functions and native Windows API,
> gradually moving in the direction of the latter.

so file-related APIs are already unpredictable, and will remain in
this state for an unknown number of ghc versions




Re[2]: [Haskell-cafe] Re: Unicode workaround for getDirectoryContents under Windows?

2009-06-16 Thread Bulat Ziganshin
Hello Simon,

Tuesday, June 16, 2009, 7:54:02 PM, you wrote:

> In fact there's not a lot left to convert in System.Directory, as you'll
> see if you look at the code.  Feel like helping?

these functions used there are ACP-only:

c_stat c_chmod System.Win32.getFullPathName c_SearchPath c_SHGetFolderPath

plus maybe some more functions from the System.Win32 package - i haven't
looked into it





Re[2]: [Haskell-cafe] Re: Unicode workaround for getDirectoryContents under Windows?

2009-06-16 Thread Bulat Ziganshin
Hello Simon,

Tuesday, June 16, 2009, 5:02:43 PM, you wrote:

> I don't know how getArgs fits in here - should we be decoding argv using
> the ACP?

myGetArgs = do
   alloca $ \p_argc -> do
   p_argv_w <- commandLineToArgvW getCommandLineW p_argc
   argc     <- peek p_argc
   argv_w   <- peekArray (fromIntegral argc) p_argv_w
   mapM peekTString argv_w >>= return.tail   -- drop the program name

foreign import stdcall unsafe "windows.h GetCommandLineW"
  getCommandLineW :: LPTSTR

foreign import stdcall unsafe "windows.h CommandLineToArgvW"
  commandLineToArgvW :: LPCWSTR -> Ptr CInt -> IO (Ptr LPWSTR)


