Hi Lex

On 31/12/12 22:31, Lex Trotman wrote:
> Hi Stuart,
> 
> Having an end of year cleanout :)

Yep!

Thanks for these comments, you've obviously been a lot further along
this path than I have. My problem with the Python 3 port is that I can't
think of a single reason I (emphasis on "I") would want to do it other
than "gee, that would be nice", so I'm not planning to roll my sleeves up
any time soon on this one.

Cheers, Stuart


> 
> Couple of comments below for anyone who wants to try it.
> 
> On 31 December 2012 13:33, Stuart Rackham <[email protected]> wrote:
>>
>>
>> On 17/10/12 04:32, Jens Getreu wrote:
>>> To follow up:
>>>
>>> https://groups.google.com/d/topic/asciidoc/0p8l1qD8-40/discussion
>>>
>>> here my question concerning python 3:
>>> When will AsciiDoc be compatible with Python 3? Any plans?
>>> What is the workload for an experienced programmer?
>>
>> I thought I had posted my python3-port notes previously but can't find
>> the post, here they are (I don't have any time to do it at the moment
>> and imagine it could take some time):
>>
>> NOTE: These notes are rough and have not been checked or verified.
>>
>> = AsciiDoc Python 3 port
>>
>> Read the 'String Changes in 3.0' in 'Learning Python 4th Ed.' first.
>>
>> I haven't got beyond the planning stage, but here are the proposed
>> conventions going forward:
>>
>> . UTF-8 is the default encoding (no change here).
> 
> As one regex bug showed, it is intended to be but isn't; Python 3
> should be a good deal better in this respect.
> 
>> . All configuration (.conf) files to be UTF-8 encoded (afaik all
>>   current .conf files are UTF-8).
>> . The AsciiDoc 'encoding' attribute sets the encoding of source
>>   files and output files (no change here).
> 
> Distinction between source and output?  If source is cp1251 the output
> for HTML should still be UTF-8 IIUC.
> 
>> . The setting of the 'encoding' attribute in AsciiDoc source documents
>>   is prohibited (you have to set it on the command-line or from
>>   configuration files).
> 
> That's error-prone: how do I remember that file xyz is cp1251 and file
> zyx is UTF-8?  It should be in the file, similar to the encoding= in
> HTML.
> 
>>
>> In theory at least, the last rule (to avoid a Catch-22) would
>> introduce a backward incompatibility because currently the User Guide
>> states ``The 'encoding' attribute can be set using an AttributeEntry
>> inside the document header''. But this is broken anyway in that it
>> only applies to character sets that are backward compatible with ASCII
>> e.g.  ISO-8859-1 (latin-1).
> 
> So long as it's ASCII-compatible up to and including the
> ':encoding: cp1251' line it should be OK, and that entry should be on
> the first line or two.
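
A minimal sketch of that idea, assuming the ':encoding:' entry sits in the
first two lines and that everything up to it is plain ASCII. The name
sniff_encoding and its parameters are hypothetical, nothing like it exists
in AsciiDoc today:

```python
import re

# Hypothetical helper (not part of AsciiDoc): matches an ':encoding: NAME'
# attribute entry on a line of its own, working on raw bytes.
_ENCODING_RE = re.compile(br'^:encoding:[ \t]*([-\w.]+)[ \t]*\r?$',
                          re.MULTILINE)

def sniff_encoding(raw, default='UTF-8', max_lines=2):
    """Return the encoding named in the first max_lines lines of raw bytes."""
    # Only inspect the head of the file; the limit of two lines is an assumption.
    head = b'\n'.join(raw.split(b'\n')[:max_lines])
    match = _ENCODING_RE.search(head)
    if match:
        return match.group(1).decode('ascii')
    return default

# Usage: sniff on bytes first, then decode the whole file once.
raw = b':encoding: cp1251\n\xcf\xf0\xe8\xe2\xe5\xf2\n'
text = raw.decode(sniff_encoding(raw))
```

This sidesteps the Catch-22 only for encodings whose ASCII range matches
ASCII (latin-1, cp1251, UTF-8); something like UTF-16 would still have to be
given on the command line.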
> 
>>
>> Software should only work with Unicode strings internally, converting
>> to a particular encoding on output.
>>
> 
> That's the only way with Python 3. IIUC all strings are sequences of
> Unicode code points; you have to explicitly use "bytes" objects for
> other behaviour.
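
To illustrate the text/bytes split (a small sketch, the variable names are
just for illustration; the u'' prefix is accepted by Python 2 and by
Python 3.3 onwards):

```python
# Text is a sequence of Unicode code points; bytes only appear when
# you encode explicitly at the boundary.
text = u'caf\u00e9'              # 4 code points
data = text.encode('utf-8')      # 5 bytes, the accented letter takes two
latin = text.encode('latin-1')   # 4 bytes, one per character

assert len(text) == 4
assert len(data) == 5
assert len(latin) == 4
assert data.decode('utf-8') == text
```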
> 
>> Port to Python 3 via 2.6, this is how Django are doing it:
>>
>> ``deprecate older 2.x releases until our minimum requirement is Python
>> 2.6, then to take advantage of the compatibility features in 2.6 to
>> carry out the actual porting and achieve Python 3 support''
> 
> I have found that this is an admirable target but isn't always
> achievable; maybe my Python programs are somewhat pathological
> anyway :)
> 
>>
>> The idea will be to have a Python 2.6 version that can be
>> automatically converted to Python 3 using `2to3` with a '2to3' AAP
>> rule.
>>
>>   2to3 -w -f idioms -f all a2x3.py
>>   2to3 -w -f idioms -f all -x next asciidoc3.py
>>
>> Use `sys.version_info >= (3, 0)` to test for Python 3.
>>
>> Need to replace all open() calls with:
>>
>>   import codecs
>>
>>   def file_open(filename, mode='r', encoding=None):
>>       # Fall back to the document's 'encoding' attribute, then to UTF-8.
>>       if not encoding:
>>           encoding = document.attributes.get('encoding', 'UTF-8')
>>       return codecs.open(filename, mode, encoding, errors='strict')
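
A possible variant of the file_open() helper above, sketched with io.open
instead of codecs.open. io.open is available from Python 2.6 and is the
built-in open() in Python 3, so the same source should behave the same
before and after 2to3. The default_encoding parameter is my stand-in for
the document.attributes lookup so the snippet runs on its own:

```python
import io
import os
import tempfile

def file_open(filename, mode='r', encoding=None, default_encoding='UTF-8'):
    """Open a text file that reads and writes Unicode strings."""
    # default_encoding is an assumption standing in for
    # document.attributes.get('encoding', 'UTF-8').
    if not encoding:
        encoding = default_encoding
    return io.open(filename, mode, encoding=encoding, errors='strict')

# Usage: round-trip some Cyrillic text through a cp1251 file.
path = os.path.join(tempfile.mkdtemp(), 'example.txt')
with file_open(path, 'w', encoding='cp1251') as f:
    f.write(u'\u041f\u0440\u0438\u0432\u0435\u0442')
with file_open(path, encoding='cp1251') as f:
    content = f.read()
```

Unlike codecs.open, io.open also does universal newline translation in
text mode, which may or may not be what AsciiDoc wants.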
>>
>> . All AsciiDoc distribution text files are UTF-8 encoded.
>> . The 'encoding' attribute sets the encoding of input and output files
>>   (defaults to UTF-8).
>> . The use of the 'encoding' attribute in the document header is prohibited
>>   ???  unless the encoding of the header is compatible with ASCII e.g.
>>   ISO-8859-1 (latin-1)
>>
>> What exactly is the encoding of text from stdin on Linux and Windows?
>> See:
>>
>> - stdout encoding is set by the OS environment and is NOT
>>   sys.getdefaultencoding(), you can read it with sys.stdout.encoding
>>   but it can only be set externally (see
>>   https://drj11.wordpress.com/2007/05/14/python-how-is-sysstdoutencoding-chosen/).
>>   Thankfully on Linux this is normally UTF-8.
>>   Things aren't so simple with Windows
>>   (http://superuser.com/questions/239810/setting-utf8-as-default-character-encoding-in-windows-7).
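
A small sketch for inspecting what the environment chose. The helper name
stream_encoding is hypothetical, and falling back to UTF-8 is an assumption,
not current AsciiDoc behaviour:

```python
import sys

def stream_encoding(stream, default='UTF-8'):
    """Return the encoding the environment chose for a stream, or a default."""
    # The attribute can be missing or None, e.g. sys.stdout.encoding is
    # None under Python 2 when output goes to a pipe.
    return getattr(stream, 'encoding', None) or default

print('stdin: %s, stdout: %s' % (stream_encoding(sys.stdin),
                                 stream_encoding(sys.stdout)))
```

One external knob is the PYTHONIOENCODING environment variable (Python 2.6
and later), which overrides the encoding of stdin, stdout and stderr.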
>>
> 
> Yeah, it's all kinda broken when using things via pipes :(
> 
>>
>> Closing the points of entry:
>> . Reader to have 'encoding' attribute so includes get the right
>>   encoding.
> 
> Assuming they are the same, maybe make that required!
> 
> 
> Cheers
> Lex
> 
> [...]
> 
