I thought the same thing: perhaps Windows has some kind of standard 
"Windows NFC" setup, and running Python's NFC normalization on top of it 
messed the string up somehow.

On Sunday, 24 January 2021 at 23:32:56 UTC+1 justin...@gmail.com wrote:

> Between my work experience in the US and New Zealand, I have been lucky 
> enough to avoid complicated unicode handling. That is, until more recently 
> when dealing with python2 to python3 code updates. So my experience in all 
> of the nitty-gritty conversions and normalizations has been pretty low. But 
> reading over the docs on that unicodedata.normalize() 
> <https://docs.python.org/3/library/unicodedata.html#unicodedata.normalize> 
> function, it talks about choosing between different forms and how single 
> unicode characters can be represented by different codes under different 
> forms. So this makes me wonder if choosing to normalize with the wrong form 
> could make a string that is visually the same actually be a different 
> filepath? Your tests seem to indicate that it might be the case, given the 
> different results with different Unicode forms. I have no idea how 
> to go about choosing the right form in terms of still matching the Windows 
> filesystem.
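
To make the question above concrete, here is a minimal sketch (not taken from the project in this thread) showing that the same visible text has different code points under NFC and NFD, so the two strings compare unequal. The sample text comes from the failing file name quoted further down.

```python
import unicodedata

name = "br\u00fcc"  # "brüc" written with a precomposed u-umlaut (U+00FC)

nfc = unicodedata.normalize("NFC", name)
nfd = unicodedata.normalize("NFD", name)

# Both render as "brüc", but NFD splits the umlaut into "u" + U+0308.
print(len(nfc), [hex(ord(c)) for c in nfc])
print(len(nfd), [hex(ord(c)) for c in nfd])

# The strings are not equal, so they can name two different files.
print(nfc == nfd)
```

As far as I understand, NTFS stores and compares names without normalizing them, so a path rewritten into the other form may simply not exist on disk.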
>
>
> On Mon, Jan 25, 2021 at 10:51 AM Benjam901 <benandr...@gmail.com> wrote:
>
>> A further update:
>>
>> I have amended my function to NFC-normalize the path only if it detects 
>> that it is running on Mac. Without the normalization everything seems 
>> totally fine on Windows. I have sent a newer build out for testing and will 
>> let the community know if it works out OK.
>>
>> Whatever was going on with the string on windows when normalized was 
>> making it behave double strange...
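
A minimal sketch of what such a platform check could look like. This is an assumption about the shape of the amended function, not the actual code from the build; the `platform` parameter is hypothetical and only there so the behaviour can be exercised on any machine.

```python
import sys
from unicodedata import normalize


def normalize_filepath(file_path, platform=sys.platform):
    """Return file_path in NFC on macOS, unchanged on every other platform."""
    # sys.platform is "darwin" on macOS and "win32" on Windows.
    if platform == "darwin":
        return normalize("NFC", file_path)
    return file_path
```

On Windows this leaves the string exactly as os.walk returned it, so it still matches the name on disk.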
>>
>> On Sunday, 24 January 2021 at 22:15:40 UTC+1 Benjam901 wrote:
>>
>>> An update: if I run *os.path.exists(file_path)* on the path in 
>>> question it returns False. However, if I call *unicodedata.normalize('NFD', 
>>> file_path)* and run the same check on the result, it returns True. 
>>>
>>> After some more testing with my wildchars folder, it appears that whether 
>>> NFC or NFD works depends on what is in the string. 
>>> For example, my wildchars folder has filenames like these, which all work 
>>> fine with normalize('NFC'):
>>> pîrvu - aлфавит 44.wav
>>> Triptil - c  hÉr , c  mÉr .wav
>>> Venda - 1.2 [pământ].wav
>>>
>>> However, the path in the error example will ONLY work when decomposed 
>>> into NFD Unicode form:
>>>
>>> 06 - brüc - Second Live.flac
>>>
>>> Am I missing something with my character encoding perhaps? 
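
The result above would be consistent with the name being stored on disk in decomposed form, with NFC rewriting it into a name that is not there. That is a guess, not something confirmed in this thread. A small diagnostic sketch for checking which spelling exists; `find_existing_form` and its `exists` parameter are hypothetical names of my own.

```python
import os
from unicodedata import normalize


def find_existing_form(file_path, exists=os.path.exists):
    """Return the first spelling of file_path that exists, or None."""
    # Try the path as given first, then its NFC and NFD spellings.
    candidates = [file_path]
    for form in ("NFC", "NFD"):
        candidate = normalize(form, file_path)
        if candidate not in candidates:
            candidates.append(candidate)
    for candidate in candidates:
        if exists(candidate):
            return candidate
    return None
```

This is meant for debugging a single problem path, not as a replacement for keeping the original string from os.walk.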
>>>
>>> // Ben
>>>
>>> On Sunday, 24 January 2021 at 21:48:53 UTC+1 Benjam901 wrote:
>>>
>>>> I have a repro case now with one of the beta user's files. I can run 
>>>> some tests on the path in question now!
>>>>
>>>> On Sunday, 24 January 2021 at 21:32:12 UTC+1 Benjam901 wrote:
>>>>
>>>>> Hey Justin,
>>>>>
>>>>> My bad, here is some more detail. The reason this is implemented is 
>>>>> that without it, database entries that contain, for example, Swedish 
>>>>> characters, such as *Adam Strömsted*, come out looking like this:
>>>>> [image: Screenshot 2021-01-24 at 21.30.39.png]
>>>>>
>>>>> *Function:*
>>>>>
>>>>>
>>>>> from unicodedata import normalize
>>>>>
>>>>> def normalize_filepath(file_path):
>>>>>     return normalize('NFC', file_path)
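
One possible alternative to normalizing the path itself, sketched under the assumption that the database can hold two columns: keep the path exactly as os.walk returned it for file access, and store a separate NFC copy for display. `make_db_record` and the key names are hypothetical.

```python
from unicodedata import normalize


def make_db_record(file_path):
    """Pair the untouched on-disk path with an NFC copy meant for display."""
    return {
        # Used for every filesystem call, never rewritten.
        "path": file_path,
        # Shown in the UI, so decomposed accents render as single characters.
        "display": normalize("NFC", file_path),
    }
```

With this split the display text can be normalized freely without risking a FileNotFoundError on the real path.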
>>>>>
>>>>> I was thinking about using the pathlib module. I have read it can help 
>>>>> with a lot of issues such as file encoding, but I was worried about 
>>>>> the speed difference between it and os.scandir, which os.walk uses 
>>>>> internally. 
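
For comparison, a minimal sketch of the same file gathering written both ways. The function names are hypothetical and the gist's actual code is not reproduced here. Neither version changes the Unicode form of the names it returns.

```python
import os
from pathlib import Path


def files_with_walk(root):
    """Collect file paths under root using os.walk, with forward slashes."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            found.append(os.path.join(dirpath, filename).replace("\\", "/"))
    return found


def files_with_pathlib(root):
    """Collect the same paths using pathlib; as_posix() gives forward slashes."""
    return [p.as_posix() for p in Path(root).rglob("*") if p.is_file()]
```

Timing both on a real music library would be needed to settle the speed question.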
>>>>>
>>>>> What are your thoughts on making the change?
>>>>>
>>>>> // Ben
>>>>>
>>>>> On Sunday, 24 January 2021 at 19:29:44 UTC+1 justin...@gmail.com 
>>>>> wrote:
>>>>>
>>>>>> What is the implementation of your custom normalize function? I see 
>>>>>> you are calling it, and you mentioned it might be the cause of your 
>>>>>> problems, but I don't see where it is defined. Does it simply call 
>>>>>> os.path.normpath? Have you tried py3 pathlib 
>>>>>> <https://docs.python.org/3/library/pathlib.html> to handle all of 
>>>>>> this, instead of os and manual string formatting? 
>>>>>> Can you easily reproduce the problem with just some simple operations 
>>>>>> on that one problematic path? For me, it seems valid in both Linux and 
>>>>>> Windows, so something must be going on in the string conversion as you 
>>>>>> suggested.
>>>>>>
>>>>>> Justin
>>>>>>
>>>>>>
>>>>>> On Mon, Jan 25, 2021 at 3:56 AM Benjam901 <benandr...@gmail.com> 
>>>>>> wrote:
>>>>>>
>>>>>>> Here is the error that the user gets on their machine. It happens 
>>>>>>> when trying to set up a data container for the path. The reason I am 
>>>>>>> confused is that the path exists and was found when I walked the 
>>>>>>> user's directory...
>>>>>>>
>>>>>>> *FileNotFoundError: [WinError 2] The system cannot find the file 
>>>>>>> specified: 'G:/De sortat 2020 apr-dec/06 - brüc - Second Live.flac'*
>>>>>>>
>>>>>>> On Sunday, 24 January 2021 at 15:38:01 UTC+1 Benjam901 wrote:
>>>>>>>
>>>>>>>> Hello community,
>>>>>>>>
>>>>>>>> I am having some trouble with the Windows build of my project. This 
>>>>>>>> has happened twice now with 2 separate users, so it needs to be 
>>>>>>>> addressed. The thing is, I find this very weird...
>>>>>>>>
>>>>>>>> The user inputs a directory they wish to iterate over. I use good 
>>>>>>>> old os.walk, replace backslashes with forward slashes, and then 
>>>>>>>> normalize the path so that the SQL database displays the special 
>>>>>>>> chars correctly. I think that last step might be the cause of my 
>>>>>>>> problems: *normalize('NFC', file_path)*
>>>>>>>>
>>>>>>>> Mac has 0 issues with this but Windows is another story.
>>>>>>>>
>>>>>>>> Is there a foolproof way to encode characters for both platforms 
>>>>>>>> that I am missing?
>>>>>>>>
>>>>>>>> Here is the gist of the simple function that gathers the paths:
>>>>>>>> https://gist.github.com/a626215d95f1b292e66b388be3c1eb2b
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>>
>>>>>>>> Ben
>>>>>>>>
>>>>>>> -- 
>>>>>>> You received this message because you are subscribed to the Google 
>>>>>>> Groups "Python Programming for Autodesk Maya" group.
>>>>>>> To unsubscribe from this group and stop receiving emails from it, 
>>>>>>> send an email to python_inside_m...@googlegroups.com.
>>>>>>> To view this discussion on the web visit 
>>>>>>> https://groups.google.com/d/msgid/python_inside_maya/f3866246-3f0d-4d09-bcf3-6966610fc3dfn%40googlegroups.com
>>>>>>>  
>>>>>>> <https://groups.google.com/d/msgid/python_inside_maya/f3866246-3f0d-4d09-bcf3-6966610fc3dfn%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>>>> .
>>>>>>>
