So I found something interesting. The file in question is named 
de_DE_ö_frami.aff. 

I cloned the repo and ran some experiments in the same directory as the file. 
First I tried the usual tricks with .encode('utf-8') and the like, but they 
didn't help. But then it struck me:

    In [6]: path = os.listdir('.')[-11]

    In [7]: path
    Out[7]: 'de_DE_\xf6_frami.aff'

    In [8]: print path
    de_DE_�_frami.aff

    In [12]: "de_DE_ö_frami.aff" == 'de_DE_\xf6_frami.aff'
    Out[12]: False

    In [13]: os.listdir(u'.')[-11]
    Out[13]: 'de_DE_\xf6_frami.aff'

    In [14]: "de_DE_ö_frami.aff"
    Out[14]: 'de_DE_\xc3\xb6_frami.aff'

So it seems that Python's os.listdir reports the filename incorrectly! I 
experimented with Cyrillic file names and found no problems:

    In [8]: os.listdir('.')[-8]
    Out[8]: '\xd0\xbf\xd1\x80\xd0\xb8\xd0\xb2\xd0\xb5\xd1\x82'

    In [9]: print '\xd0\xbf\xd1\x80\xd0\xb8\xd0\xb2\xd0\xb5\xd1\x82'
    привет

    In [10]: os.listdir(u'.')[-8]
    Out[10]: u'\u043f\u0440\u0438\u0432\u0435\u0442'

    In [12]: print os.listdir(u'.')[-8]
    привет

Conclusion: we have a strange, rare bug with Python's os module scrambling 
Unicode filenames. 
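
To make the byte-level difference in the sessions above easier to see, here is a minimal sketch. It assumes (this is not confirmed anywhere in the thread) that the `\xf6` byte is the Latin-1 encoding of `ö`, whereas `\xc3\xb6` is its UTF-8 encoding. `decode_filename` is a hypothetical helper made up for illustration; it is not part of Allura or of the `os` module.

```python
def decode_filename(raw, fallback="latin-1"):
    """Hypothetical helper: decode filename bytes as UTF-8, else fall back.

    The Latin-1 fallback is an assumption. It cannot raise, because every
    byte value maps to a code point in Latin-1.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode(fallback)


# Byte strings copied from the sessions above.
german = b"de_DE_\xf6_frami.aff"
cyrillic = b"\xd0\xbf\xd1\x80\xd0\xb8\xd0\xb2\xd0\xb5\xd1\x82"

german_name = decode_filename(german)      # not valid UTF-8, uses the fallback
cyrillic_name = decode_filename(cyrillic)  # valid UTF-8, decoded directly

# Re-encoding the decoded name as UTF-8 gives the bytes typed at In [14].
german_utf8 = german_name.encode("utf-8")
```

If that assumption holds, the `False` at `In [12]` would simply be two different encodings of the same name being compared byte for byte, and a fallback like this might be one way for the snapshot code to avoid the UnicodeDecodeError.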


---

** [tickets:#7757] UnicodeDecodeError when generating code snapshot on hg repo**

**Status:** in-progress
**Milestone:** unreleased
**Labels:** support sf-current 42cc sf-1 
**Created:** Fri Oct 10, 2014 03:14 PM UTC by Anonymous
**Last Updated:** Tue Jun 30, 2015 05:13 PM UTC
**Owner:** Igor Bondarenko

*Originally created by:* jwb1980

https://sourceforge.net/p/forge/site-support/8700/

----

[forge:site-support:#8700]


----

From IRC #sourceForge
download the source code of this project 
https://sourceforge.net/p/nhunspell/code/ci/default/tree/
3:55 When I try the snapshot Sourceforge says "We're having trouble finding 
that snapshot. Would you like to resubmit?"
3:55 TortoiseSVN gives me error 500 in my fork repository

----



