Re[2]: [Haskell-cafe] Re: Unicode workaround for getDirectoryContents under Windows?
Hello Simon, Wednesday, June 17, 2009, 11:55:15 AM, you wrote: Right, so getArgs is already fine. it's what i've found in Jun15 sources: #ifdef __GLASGOW_HASKELL__ getArgs :: IO [String] getArgs = alloca $ \ p_argc - alloca $ \ p_argv - do getProgArgv p_argc p_argv p- fromIntegral `liftM` peek p_argc argv - peek p_argv peekArray (p - 1) (advancePtr argv 1) = mapM peekCString foreign import ccall unsafe getProgArgv getProgArgv :: Ptr CInt - Ptr (Ptr CString) - IO () it uses peekCString so by any means it cannot produce unicode chars -- Best regards, Bulatmailto:bulat.zigans...@gmail.com ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re[2]: [Haskell-cafe] Re: Unicode workaround for getDirectoryContents under Windows?
Hello Simon, Wednesday, June 17, 2009, 12:01:11 PM, you wrote: foreign import stdcall unsafe GetFullPathNameW c_GetFullPathName :: LPCTSTR - DWORD - LPTSTR - Ptr LPTSTR - IO DWORD you are right, i was troubled by unused GetFullPathNameA import in System.Directory: #if defined(mingw32_HOST_OS) foreign import stdcall unsafe GetFullPathNameA c_GetFullPathName :: CString - CInt - CString - Ptr CString - IO CInt #else foreign import ccall unsafe realpath c_realpath :: CString - CString - IO CString #endif So as you can see, there's not much left to do. I'll fix openFile. c_stat is widely used here and there. it may be that half of System.Directory functions is broken due to direct or indirect calls to this function -- Best regards, Bulatmailto:bulat.zigans...@gmail.com ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re[2]: [Haskell-cafe] Re: Unicode workaround for getDirectoryContents under Windows?
Hello Simon, Wednesday, June 17, 2009, 12:46:49 PM, you wrote: I see, so you were previously quoting code from some other source. from my program Where did the GetCommandLineW version come from? Do you know of any issues that would prevent us using it in GHC? it should be as fine as any other *W calls. the only thing is that we may prefer to include in into Win32 package as other routines and then call from there -- Best regards, Bulatmailto:bulat.zigans...@gmail.com ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re[2]: [Haskell-cafe] Re: Unicode workaround for getDirectoryContents under Windows?
Hello Simon, Tuesday, June 16, 2009, 4:34:29 PM, you wrote: Thanks for reminding me that openFile is also broken. It's easily fixed, so I'll look into that. i fear that it will leave GHC libs in inconsistent state that can drive users mad. now at least there are some rules of brokeness. when some functions will be unicode-aware and some ansi codepaged, and this may chnage in every version, this unicode support will become completely useless. it will be like floating Base situation when it's impossible to write programs against Base since it's each time different also, i think that the best way to fix windows compatibility is to provide smth like this: #if WINDOWS type CWFilePath = LPCTSTR -- filename in C land type CWFileOffset = Int64 -- filesize or filepos in C land withCWFilePath = withTString -- FilePath-CWFilePath conversion peekCWFilePath = peekTString -- CWFilePath-FilePath conversion #else type CWFilePath = CString type CWFileOffset = COff withCWFilePath = withCString peekCWFilePath = peekCString #endif and then systematically rewrite all string-related OS API calls using these definitions how much meaning will be to have openFile and getDirContents unicode-aware, if deleteFile and even getFileStat aren't unicode-aware? i've attached my own internal module that makes this job for my own program - just for reference -- Best regards, Bulatmailto:bulat.zigans...@gmail.com Win32Files.hs Description: Binary data ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re[2]: [Haskell-cafe] Re: Unicode workaround for getDirectoryContents under Windows?
Hello Simon, Tuesday, June 16, 2009, 5:02:43 PM, you wrote: Also currently broken: * calling removeFile on a FilePath you get from getDirectoryContents, amongst other System.Directory operations Fixing getDirectoryContents will fix these. no. removeFile like anything else also uses ACP-based api I don't know how getArgs fits in here - should we be decoding argv using the ACP? well, the whole story: windows internally uses Unicode for handling strings. externally, it provides 2 API families: 1) A-family (such as CreateFileA) uses 8-bit char-based strings. these strings are encoded using current locale. First 128 chars are common for all codepages, providing ASCII char set, higher 128 chars are locale-specific. say, for German locale, it provides chars with umlauts, for Russian locale - cyrillic chars 2) W-family (such as CreateFileW) uses UTF-16 encoded 16-bit wchar-based strings, which are locale-independent Windows libraries emulates POSIX API (open, opendir, stat and so on) by translating these (char-based) calls into A-family. GHC libs are written Unix way, so these are effectively bundled to A-family of Win API Windows libraries also provides w* variant of POSIX API (wopen, wopendir, wstat...) that uses UTF-16 encoded 16-bit wchar-based strings, so for proper handling of Unicode strings (filenames, cmdline arguments) we should use these APIs my old proposal: http://haskell.org/haskellwiki/Library/IO -- Best regards, Bulatmailto:bulat.zigans...@gmail.com ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re[2]: [Haskell-cafe] Re: Unicode workaround for getDirectoryContents under Windows?
Hello Simon, Tuesday, June 16, 2009, 7:30:55 PM, you wrote: Actually we use a mixture of CRT functions and native Windows API, gradually moving in the direction of the latter. so file-related APIs are already unpredictable, and will remain in this state for unknown amount of ghc versions -- Best regards, Bulatmailto:bulat.zigans...@gmail.com ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re[2]: [Haskell-cafe] Re: Unicode workaround for getDirectoryContents under Windows?
Hello Simon, Tuesday, June 16, 2009, 7:54:02 PM, you wrote: In fact there's not a lot left to convert in System.Directory, as you'll see if you look at the code. Feel like helping? these functions used there are ACP-only: c_stat c_chmod System.Win32.getFullPathName c_SearchPath c_SHGetFolderPath plus may be some more functions from System.Win32 package - i don't looked into it -- Best regards, Bulatmailto:bulat.zigans...@gmail.com ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re[2]: [Haskell-cafe] Re: Unicode workaround for getDirectoryContents under Windows?
Hello Simon, Tuesday, June 16, 2009, 5:02:43 PM, you wrote: I don't know how getArgs fits in here - should we be decoding argv using the ACP? myGetArgs = do alloca $ \p_argc - do p_argv_w - commandLineToArgvW getCommandLineW p_argc argc - peek p_argc argv_w - peekArray (i argc) p_argv_w mapM peekTString argv_w = return.tail foreign import stdcall unsafe windows.h GetCommandLineW getCommandLineW :: LPTSTR foreign import stdcall unsafe windows.h CommandLineToArgvW commandLineToArgvW :: LPCWSTR - Ptr CInt - IO (Ptr LPWSTR) -- Best regards, Bulatmailto:bulat.zigans...@gmail.com ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe