On Friday, 7 June 2013 at 5:29:25 PM UTC+3, MRAB wrote:
> This is a worse way of doing it because the ISO-8859-7 encoding has 1
> byte per codepoint, meaning that it's more 'tolerant' (if that's the
> word) of errors. A sequence of bytes that is actually UTF-8 can be
> decoded as ISO-8859-7, giving gibberish.
>
> UTF-8 is less tolerant, and it's the encoding that ideally you should
> be using everywhere, so it's better to assume UTF-8 and, if it fails,
> try ISO-8859-7 and then rename so that any names that were ISO-8859-7
> will be converted to UTF-8.

Indeed, I wasn't aware of that. At the time I was under the impression that a string encoded to bytes using some charset could only be decoded back with that same charset and no other.

Since this is the case, here is my fix:

#========================================================
# Collect filenames of the path dir as bytes
filename_bytes = os.listdir( b'/home/nikos/public_html/data/apps/' )

for filename in filename_bytes:
    # Compute 'path/to/filename' into bytes
    filepath_bytes = b'/home/nikos/public_html/data/apps/' + filename
    flag = False

    try:
        # Assume current file is utf8 encoded
        filepath = filepath_bytes.decode('utf-8')
        flag = 'utf8'
    except UnicodeDecodeError:
        try:
            # Since current filename is not utf8 encoded then it has to be greek-iso encoded
            filepath = filepath_bytes.decode('iso-8859-7')
            flag = 'greek'
        except UnicodeDecodeError:
            print( '''I give up! File name is unreadable!''' )

    if( flag == 'greek' )
        # Rename filename from greek bytes --> utf-8 bytes
        os.rename( filepath_bytes, filepath.encode('utf-8') )

#========================================================
filenames = os.listdir( '/home/nikos/public_html/data/apps/' )

# Load'em
for filename in filenames:
    try:
        # Check the presence of a file against the database and insert if it doesn't exist
        cur.execute('''SELECT url FROM files WHERE url = %s''', (filename,) )
        data = cur.fetchone()

        if not data:
            # First time for file; primary key is automatic, hit is defaulted
            cur.execute('''INSERT INTO files (url, host, lastvisit) VALUES (%s, %s, %s)''', (filename, host, lastvisit) )
    except pymysql.ProgrammingError as e:
        print( repr(e) )

#========================================================
filenames = os.listdir( '/home/nikos/public_html/data/apps/' )
filepaths = set()

# Build a set of the filenames found in the path dir
for filename in filenames:
    filepaths.add( filename )

# Delete spurious
cur.execute('''SELECT url FROM files''')
data = cur.fetchall()

# Check database's filenames against path's filenames
# (each fetched row is a tuple, so the url is rec[0])
for rec in data:
    if rec[0] not in filepaths:
        cur.execute('''DELETE FROM files WHERE url = %s''', (rec[0],) )

=============================
ni...@superhost.gr [~/www/cgi-bin]#
[Fri Jun 07 21:49:33 2013] [error] [client 79.103.41.173] File "/home/nikos/public_html/cgi-bin/files.py", line 81
[Fri Jun 07 21:49:33 2013] [error] [client 79.103.41.173] if( flag == 'greek' )
[Fri Jun 07 21:49:33 2013] [error] [client 79.103.41.173] ^
[Fri Jun 07 21:49:33 2013] [error] [client 79.103.41.173] SyntaxError: invalid syntax
[Fri Jun 07 21:49:33 2013] [error] [client 79.103.41.173] Premature end of script headers: files.py
-------------------------------

I don't know why that if statement errors.
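
For comparison, here is a minimal, self-contained sketch of the "try UTF-8 first, fall back to ISO-8859-7, then rename" idea from the quoted advice. It is an illustration only and not the contents of files.py: the helper names decode_name and rename_greek_to_utf8 are hypothetical, and the directory is passed in as a bytes path rather than hard-coded. Note that the `if` line ends with a colon, which Python requires on every compound statement header; the line in the traceback above has none.

```python
import os


def decode_name(raw: bytes):
    """Return (text, flag) for a filename given as bytes.

    flag is 'utf8' or 'greek'; (None, None) means neither codec could decode it.
    UTF-8 is tried first because it is the stricter of the two encodings.
    """
    try:
        return raw.decode('utf-8'), 'utf8'
    except UnicodeDecodeError:
        pass
    try:
        return raw.decode('iso-8859-7'), 'greek'
    except UnicodeDecodeError:
        return None, None


def rename_greek_to_utf8(dirpath: bytes):
    """Rename ISO-8859-7 encoded names in dirpath to UTF-8; return the new names."""
    renamed = []
    # A bytes path makes os.listdir() return the names as raw bytes.
    for name in os.listdir(dirpath):
        text, flag = decode_name(name)
        if flag == 'greek':  # note the trailing colon
            new_name = text.encode('utf-8')
            os.rename(os.path.join(dirpath, name),
                      os.path.join(dirpath, new_name))
            renamed.append(new_name)
    return renamed
```

The order of the two attempts matters: nearly every byte sequence decodes under ISO-8859-7, so trying it first would accept UTF-8 names as Greek and produce gibberish.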