On 08/06/2013 07:49, Νικόλαος Κούρας wrote:
On Saturday, 8 June 2013 at 5:52:22 AM UTC+3, Cameron Simpson wrote:
On 07Jun2013 11:52, Νίκος Γκρ33κ <nikos.gr...@gmail.com> wrote:

| ni...@superhost.gr [~/www/cgi-bin]# [Fri Jun 07 21:49:33 2013] [error] [client 79.103.41.173]   File "/home/nikos/public_html/cgi-bin/files.py", line 81
| [Fri Jun 07 21:49:33 2013] [error] [client 79.103.41.173]     if( flag == 'greek' )
| [Fri Jun 07 21:49:33 2013] [error] [client 79.103.41.173]                         ^
| [Fri Jun 07 21:49:33 2013] [error] [client 79.103.41.173] SyntaxError: invalid syntax
| [Fri Jun 07 21:49:33 2013] [error] [client 79.103.41.173] Premature end of script headers: files.py

| -------------------------------

| I don't know why that if statement errors.



Python statements that continue (if, while, try etc) end in a colon, so:
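As a minimal sketch of that rule, the snippet below compiles two hypothetical one-statement sources: the line from the traceback as written, and the same test with the trailing colon added. The `compiles` helper and the `pass` bodies are illustrative assumptions, not part of files.py.

```python
# Hypothetical reproductions of the statement at files.py line 81.
broken = "if( flag == 'greek' )\n    pass\n"   # no trailing colon
fixed = "if flag == 'greek':\n    pass\n"      # colon added

def compiles(source):
    """Return True if `source` is syntactically valid Python."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False

print(compiles(broken))  # False
print(compiles(fixed))   # True
```

The parentheses around the condition are legal but unnecessary; only the missing colon makes the first version fail.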

Oh, I am very sorry.
Oh my God, I can't believe I missed a colon *again*.

I have corrected this:

#========================================================
import os

# Collect the filenames in the directory as bytes
filename_bytes = os.listdir( b'/home/nikos/public_html/data/apps/' )

for filename in filename_bytes:
        # Build 'path/to/filename' as bytes, using the loop variable
        # (the literal b'filename' would give the same path every time)
        filepath_bytes = b'/home/nikos/public_html/data/apps/' + filename
        flag = False
        
        try:
                # Assume current file is utf8 encoded
                filepath = filepath_bytes.decode('utf-8')
                flag = 'utf8'
        except UnicodeDecodeError:
                try:
                        # The current filename is not utf8 encoded, so it has to be greek-iso encoded
                        filepath = filepath_bytes.decode('iso-8859-7')
                        flag = 'greek'
                except UnicodeDecodeError:
                        print( '''I give up! File name is unreadable!''' )
        
        if flag == 'greek':
                # Rename filename from greek bytes --> utf-8 bytes
                os.rename( filepath_bytes, filepath.encode('utf-8') )
==================================

Now everything was supposed to work, but instead I am getting this surrogate 
error once more.
What is this surrogate thing?

Since I use error catching and handling like 'except UnicodeDecodeError:', 
if the utf8 decode fails for some reason, shouldn't it leave that file alone 
and try the next file?
        try:
                # Assume current file is utf8 encoded
                filepath = filepath_bytes.decode('utf-8')
                flag = 'utf8'
        except UnicodeDecodeError:

This is what it is supposed to do, correct?
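A minimal sketch of that fallback, pulled out into a hypothetical helper named `decode_name` (it is not in the original script). It tries UTF-8 first, then ISO-8859-7, and returns a pair of `None` values when neither codec fits, so the caller can skip the file and move on. The sample byte strings are made up for illustration.

```python
def decode_name(raw):
    """Return (text, encoding) for a bytes filename.

    Returns (None, None) if neither codec can decode it.
    """
    for encoding in ('utf-8', 'iso-8859-7'):
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue  # this codec failed; try the next one
    return None, None

# Hypothetical sample names: the Greek letter alpha plus '.txt'.
utf8_name = '\u03b1.txt'.encode('utf-8')         # b'\xce\xb1.txt'
greek_name = '\u03b1.txt'.encode('iso-8859-7')   # b'\xe1.txt'

print(decode_name(utf8_name)[1])    # utf-8
print(decode_name(greek_name)[1])   # iso-8859-7
```

Note that an `except UnicodeDecodeError:` clause only covers failures of `.decode()` inside its own `try` block. It does not catch a UnicodeEncodeError raised elsewhere in the script.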

==================================
[Sat Jun 08 09:39:34 2013] [error] [client 79.103.41.173]   File "/home/nikos/public_html/cgi-bin/files.py", line 94, in <module>
[Sat Jun 08 09:39:34 2013] [error] [client 79.103.41.173]     cur.execute('''SELECT url FROM files WHERE url = %s''', (filename,) )
[Sat Jun 08 09:39:34 2013] [error] [client 79.103.41.173]   File "/usr/local/lib/python3.3/site-packages/PyMySQL3-0.5-py3.3.egg/pymysql/cursors.py", line 108, in execute
[Sat Jun 08 09:39:34 2013] [error] [client 79.103.41.173]     query = query.encode(charset)
[Sat Jun 08 09:39:34 2013] [error] [client 79.103.41.173] UnicodeEncodeError: 'utf-8' codec can't encode character '\\udcce' in position 35: surrogates not allowed

Look at the traceback.

It says that the exception was raised by:

    query = query.encode(charset)

which was called by:

    cur.execute('''SELECT url FROM files WHERE url = %s''', (filename,) )

But what is 'filename'? And what has it to do with the first code
snippet? Does the traceback have _anything_ to do with the first code
snippet?
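On the "surrogate" question, here is a minimal sketch of one way a character like '\udcce' can end up in a str. It assumes, and the thread does not confirm this, that `filename` at line 94 was produced by something like os.listdir() called with a str path. In that case Python 3 decodes file names with the 'surrogateescape' error handler, which turns each undecodable byte into a lone surrogate code point. The byte string `raw` is hypothetical.

```python
# Hypothetical file name bytes: 0xCE is not valid UTF-8 when followed by '.'.
raw = b'\xce.txt'

# Decoding with 'surrogateescape' does not raise; the bad byte 0xCE
# becomes the lone surrogate U+DCCE.
name = raw.decode('utf-8', 'surrogateescape')
print(ascii(name))  # '\udcce.txt'

# A strict UTF-8 encode (what a database driver would typically do)
# rejects the lone surrogate.
try:
    name.encode('utf-8')
except UnicodeEncodeError as exc:
    reason = exc.reason
    print(reason)  # surrogates not allowed

# Encoding with the same handler restores the original bytes.
roundtrip = name.encode('utf-8', 'surrogateescape')
print(roundtrip == raw)  # True
```

If that assumption holds, the str being passed to cur.execute() is a different object from the `filepath` decoded in the first snippet, which would explain why the try/except there never sees the error.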

--
http://mail.python.org/mailman/listinfo/python-list
