String concatenation, formatting, comparison, and indexing a dictionary* all
work with mixed types, unless a unicode value that is forced to a byte string
can't be encoded with the current codec, in which case you get a suitable
exception.
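
As a rough sketch of that failure case (not taken from the code in this thread; the Cyrillic name and the to_bytes helper are made-up stand-ins), here is the conversion written out explicitly, so it behaves the same on Python 2 and Python 3:

```python
ascii_name = u'public_html'
cyrillic_name = u'\u0442\u0435\u0441\u0442'  # hypothetical directory name


def to_bytes(text, codec='ascii'):
    """Encode text with codec; return None if the codec can't represent it."""
    try:
        return text.encode(codec)
    except UnicodeEncodeError:
        # Same exception type as in the traceback quoted below.
        return None


ascii_ok = to_bytes(ascii_name)                   # plain ASCII encodes fine
cyrillic_ascii = to_bytes(cyrillic_name)          # None: 'ascii' can't encode it
cyrillic_utf8 = to_bytes(cyrillic_name, 'utf-8')  # works with a capable codec
```

The ASCII-only value converts either way, which is why mixing usually goes unnoticed until a non-ASCII name shows up.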

Besides, this wouldn't explain why the value of subdir_path is coming up as
u'http' when his URL is ...?subdir=public_html.


* (Officially, attribute names, though used to do lookups in __dict__, are not
permitted to have non-ASCII characters, according to the PEP that Karen often
quotes.)
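
For reference, the os-module behaviour Daniel describes in the quoted message below (results come back in the same string type as the path you pass in) can be checked with a small sketch. This is written for Python 3, where the text type is str and the byte type is bytes, whereas the thread is about Python 2.5. The temporary directory and the walk_types helper are only for illustration:

```python
import os
import tempfile


def walk_types(top):
    """Return the set of types os.walk yields for roots, dirs and files."""
    seen = set()
    for root, dirs, files in os.walk(top):
        seen.add(type(root))
        seen.update(type(name) for name in dirs + files)
    return seen


# Throwaway tree: <tmp>/public_html/index.fcgi
base = tempfile.mkdtemp()
os.mkdir(os.path.join(base, 'public_html'))
open(os.path.join(base, 'public_html', 'index.fcgi'), 'w').close()

text_types = walk_types(base)               # text path in, text names out
byte_types = walk_types(os.fsencode(base))  # byte path in, byte names out
```

So whichever type goes into os.walk decides the type of every name that comes out, and of everything later joined with it.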

On Wed, Apr 7, 2010 at 2:56 PM, Daniel Roseman <dan...@roseman.org.uk> wrote:
> On Apr 7, 9:40 am, Alexey Vlasov <ren...@renton.name> wrote:
>> Hi.
>>
>> There's some simple code in urls.py:
>> ==============
>> def ls (request):
>>     import os
>>
>>     out_html = ''
>>     home_path = '/home/www/test-django'
>>     # subdir_path = request.GET.get ('subdir')
>>     subdir_path = 'public_html'
>>
>>     for root, dirs, files in os.walk (os.path.join (home_path, subdir_path)):
>>         out_html += "%s<br/>\n" % root
>>
>>     return HttpResponse (out_html)
>> ==============
>>
>> There's a directory in "home_path/subdir_path" whose name
>> includes Cyrillic symbols ( ):
>> $ pwd
>> /home/www/test-django/public_html
>> $ ls -la
>> drwx---r-x  4 test-django test-django  111 Apr  6 20:26 .
>> drwx--x--- 13 test-django test-django 4096 Apr  6 20:26 ..
>> -rw-r--r--  1 test-django test-django  201 Apr  6 17:43 .htaccess
>> -rwxr-xr-x  1 test-django test-django  911 Apr  6 16:38 index.fcgi
>> lrwxrwxrwx  1 test-django test-django   66 Mar 28 17:34 media -> ../python/lib64/python2.5/site-packages/django/contrib/admin/media
>> drwxr-xr-x  2 test-django test-django    6 Apr  6 15:48
>>
>> My code works correctly, here's the result:
>> $ curl -s http://test-django.example.com/ls/
>> /home/www/test-django/public_html <br/>
>> /home/www/test-django/public_html/ <br/>
>>
>> But if I change "subdir_path = 'public_html'" to
>> "subdir_path = request.GET.get ('subdir')" then the request:
>> $ curl -s http://test-django.example.com/ls/\?subdir=public_html
>> leads to an error:
>>
>> Request Method: GET
>> Request URL: http://test-django.example.com/ls/
>> Django Version: 1.0.2 final
>> Python Version: 2.5.2
>> Installed Applications:
>> ['django.contrib.auth',
>>  'django.contrib.contenttypes',
>>  'django.contrib.sessions',
>>  'django.contrib.sites']
>> Installed Middleware:
>> ('django.middleware.common.CommonMiddleware',
>>  'django.contrib.sessions.middleware.SessionMiddleware',
>>  'django.contrib.auth.middleware.AuthenticationMiddleware')
>>
>> Traceback:
>> File "/home/www/test-django/python/lib64/python2.5/site-packages/django/core/handlers/base.py" in get_response
>>   86.                 response = callback(request, *callback_args, **callback_kwargs)
>> File "/home/www/test-django/django/demo/urls.py" in ls
>>   40.     for root, dirs, files in os.walk (os.path.join (home_path, subdir_path)):
>> File "/usr/lib64/python2.5/os.py" in walk
>>   293.         if isdir(join(top, name)):
>> File "/usr/lib64/python2.5/posixpath.py" in isdir
>>   195.         st = os.stat(path)
>>
>> Exception Type: UnicodeEncodeError at /ls/
>> Exception Value: 'ascii' codec can't encode characters in position 45-48: ordinal not in range(128)
>>
>> I don't understand why "subdir_path", with the very same value, works
>> perfectly in one case and fails in the other.
>>
>> Django is run following the instructions at
>> +http://docs.djangoproject.com/en/dev/howto/deployment/fastcgi/#runnin...
>> +h-apache
>>
>> --
>> BRGDS. Alexey Vlasov.
>
> I think I know the reason for the difference in hard-coding the string
> vs getting it from request.GET.
>
> Django always uses unicode strings internally, and this includes GET
> parameters. So your 'public_html' string is actually being converted
> to u'public_html', as you can see if you print the contents of
> request.GET. But your hard-coded string is a bytestring. If you used
> the unicode version - subdir_path = u'public_html'  - you would see
> the same result as with the GET version.
>
> As to why this is causing a problem when combined with os.walk and
> os.path.join, this is because of the rather strange behaviour of the
> functions in the os module. If you pass a unicode path parameter to
> them, they return results in unicode. But if you pass a bytestring
> parameter, the results are bytestrings. And since you have not
> declared a particular encoding, Python assumes it is ascii - and of
> course your Cyrillic filenames are not valid in ASCII.
>
> The problem should go away if you are careful to define *all* your
> strings as unicode - it is the mixture of unicode and bytestrings that
> is causing the problem. This means:
>    out_html = u''
>    home_path = u'/home/www/test-django'
>    ...
>        out_html += u"%s<br/>\n" % root
>
> --
> DR
>
> --
> You received this message because you are subscribed to the Google Groups 
> "Django users" group.
> To post to this group, send email to django-us...@googlegroups.com.
> To unsubscribe from this group, send email to 
> django-users+unsubscr...@googlegroups.com.
> For more options, visit this group at 
> http://groups.google.com/group/django-users?hl=en.
>
>
