So I made an interesting finding. The file in question is named
de_DE_ö_frami.aff.
I cloned the repo and ran some experiments in the same directory as the file.
First I tried the usual tricks with .encode('utf-8') and the like, but they
didn't help. But then it struck me:
In [6]: path = os.listdir('.')[-11]
In [7]: path
Out[7]: 'de_DE_\xf6_frami.aff'
In [8]: print path
de_DE_�_frami.aff
In [12]: "de_DE_ö_frami.aff" == 'de_DE_\xf6_frami.aff'
Out[12]: False
In [13]: os.listdir(u'.')[-11]
Out[13]: 'de_DE_\xf6_frami.aff'
In [14]: "de_DE_ö_frami.aff"
Out[14]: 'de_DE_\xc3\xb6_frami.aff'
So it seems that Python's os.listdir reports the filename incorrectly! I
experimented with Cyrillic file names and found no problems:
In [8]: os.listdir('.')[-8]
Out[8]: '\xd0\xbf\xd1\x80\xd0\xb8\xd0\xb2\xd0\xb5\xd1\x82'
In [9]: print '\xd0\xbf\xd1\x80\xd0\xb8\xd0\xb2\xd0\xb5\xd1\x82'
привет
In [10]: os.listdir(u'.')[-8]
Out[10]: u'\u043f\u0440\u0438\u0432\u0435\u0442'
In [12]: print os.listdir(u'.')[-8]
привет
Conclusion: we have a strange, rare bug with Python's os module scrambling
Unicode filenames.
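
The byte strings from the two sessions can also be compared directly. The
following is a minimal sketch in Python 3 syntax (the sessions above are
Python 2); the helper name decodes_as is made up for illustration and nothing
here is taken from Allura. It only checks which encodings each byte string is
valid in. The output would be consistent with the listed name being stored as
Latin-1 rather than UTF-8, but that is a reading of the session output, not
something confirmed in this thread. The last step shows the surrogateescape
error handler as one possible way to carry such a name around without a
UnicodeDecodeError.

```python
# Byte strings copied from the sessions above.
listed = b'de_DE_\xf6_frami.aff'       # as returned by os.listdir
typed = b'de_DE_\xc3\xb6_frami.aff'    # as produced by the typed literal
cyrillic = b'\xd0\xbf\xd1\x80\xd0\xb8\xd0\xb2\xd0\xb5\xd1\x82'


def decodes_as(raw, encoding):
    """Return raw decoded with encoding, or None if the bytes are invalid."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None


# The typed literal and the Cyrillic name are both valid UTF-8.
print(decodes_as(typed, 'utf-8') is not None)     # True
print(decodes_as(cyrillic, 'utf-8') is not None)  # True

# The listed name is not: 0xf6 is not a valid UTF-8 start byte.
print(decodes_as(listed, 'utf-8') is None)        # True

# Read as Latin-1, the listed name is the same text as the typed one.
print(decodes_as(listed, 'latin-1') == decodes_as(typed, 'utf-8'))  # True

# errors='surrogateescape' lets undecodable bytes survive a round trip.
name = listed.decode('utf-8', 'surrogateescape')
roundtrip = name.encode('utf-8', 'surrogateescape')
print(roundtrip == listed)                        # True
```

This would also explain why the Cyrillic names caused no trouble: those bytes
are already valid UTF-8, so no decoding step has anything to trip over.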
---
** [tickets:#7757] UnicodeDecodeError when generating code snapshot on hg repo**
**Status:** in-progress
**Milestone:** unreleased
**Labels:** support sf-current 42cc sf-1
**Created:** Fri Oct 10, 2014 03:14 PM UTC by Anonymous
**Last Updated:** Tue Jun 30, 2015 05:13 PM UTC
**Owner:** Igor Bondarenko
*Originally created by:* jwb1980
https://sourceforge.net/p/forge/site-support/8700/
----
[forge:site-support:#8700]
----
From IRC #sourceForge
download the source code of this project
https://sourceforge.net/p/nhunspell/code/ci/default/tree/
3:55 When I try the snapshot Sourceforge says "We're having trouble finding
that snapshot. Would you like to resubmit?"
3:55 TortoiseSVN gives me error 500 in my fork repository
----