On 17/10/12 04:32, Jens Getreu wrote:
> To follow up:
> 
> https://groups.google.com/d/topic/asciidoc/0p8l1qD8-40/discussion
> 
> here my question concerning python 3:
> When will AsciiDoc be compatible with Python 3? Any plans?
> What is the workload for an experienced programmer?

I thought I had posted my Python 3 port notes previously but can't find
the post, so here they are. I don't have any time to do the port at the
moment, and I imagine it could take some time.

NOTE: These notes are rough and have not been checked or verified.

= AsciiDoc Python 3 port

Read the 'String Changes in 3.0' in 'Learning Python 4th Ed.' first.

I haven't got beyond the planning stage, but here are the proposed
conventions going forward:

. UTF-8 is the default encoding (no change here).
. All configuration (.conf) files to be UTF-8 encoded (afaik all
  current .conf files are UTF-8).
. The AsciiDoc 'encoding' attribute sets the encoding of source
  files and output files (no change here).
. The setting of the 'encoding' attribute in AsciiDoc source documents
  is prohibited (you have to set it on the command-line or from
  configuration files).

In theory at least, the last rule (to avoid a Catch-22) would
introduce a backward incompatibility because currently the User Guide
states ``The 'encoding' attribute can be set using an AttributeEntry
inside the document header''. But this is broken anyway in that it
only applies to character sets that are backward compatible with ASCII
e.g. ISO-8859-1 (latin-1).

Software should only work with Unicode strings internally, converting
to a particular encoding on output.
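
A minimal sketch of that rule, assuming a hypothetical `convert()` helper
(not part of AsciiDoc): bytes are decoded once on input, all processing
happens on Unicode text, and bytes are produced again only on output.

```python
def convert(src_bytes, in_encoding='UTF-8', out_encoding='UTF-8'):
    """Decode once on input, work on Unicode, encode once on output."""
    text = src_bytes.decode(in_encoding)   # bytes -> Unicode at the input boundary
    text = text.upper()                    # stand-in for the real processing step
    return text.encode(out_encoding)       # Unicode -> bytes at the output boundary

# Latin-1 input is re-encoded as UTF-8 output.
latin1_input = u'caf\xe9'.encode('ISO-8859-1')
utf8_output = convert(latin1_input, in_encoding='ISO-8859-1')
```

The processing step never sees an encoded byte string, so it does not need
to know which encoding the source file used.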

Port to Python 3 via 2.6; this is how Django are doing it:

``deprecate older 2.x releases until our minimum requirement is Python
2.6, then to take advantage of the compatibility features in 2.6 to
carry out the actual porting and achieve Python 3 support''

The idea will be to have a Python 2.6 version that can be
automatically converted to Python 3 using `2to3` with a '2to3' AAP
rule.

  2to3 -w -f idioms -f all a2x3.py
  2to3 -w -f idioms -f all -x next asciidoc3.py

Use `sys.version_info >= (3, 0)` to test for Python 3.
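
As a rough illustration, the test could be kept in one place and reused.
The names `PY3` and `text_type` are assumptions made for this sketch, not
existing AsciiDoc names.

```python
import sys

# True when running under a Python 3 interpreter.
PY3 = sys.version_info >= (3, 0)

if PY3:
    text_type = str
else:
    text_type = unicode  # noqa: F821 -- only evaluated on Python 2

# Unicode literals are instances of text_type on both major versions.
is_text = isinstance(u'abc', text_type)
```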

Need to replace all open() calls with:

  import codecs
  # Open a text file, defaulting to the document's 'encoding' attribute.
  def file_open(filename, mode='r', encoding=None):
      if not encoding:
          encoding = document.attributes.get('encoding', 'UTF-8')
      return codecs.open(filename, mode, encoding, errors='strict')
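
The following is a self-contained sketch of how such a wrapper might be
exercised. It swaps in `io.open` (available from Python 2.6 and the same
function as the built-in `open` in Python 3) instead of `codecs.open`, and
it uses a plain `attributes` dict as a hypothetical stand-in for
`document.attributes`; both are assumptions made so the example runs on
its own.

```python
import io
import os
import tempfile

# Hypothetical stand-in for document.attributes.
attributes = {'encoding': 'ISO-8859-1'}

def file_open(filename, mode='r', encoding=None):
    """Open a text file using the 'encoding' attribute unless overridden."""
    if not encoding:
        encoding = attributes.get('encoding', 'UTF-8')
    return io.open(filename, mode, encoding=encoding, errors='strict')

path = os.path.join(tempfile.mkdtemp(), 'sample.txt')

# Write and read back Unicode text; the wrapper handles the encoding.
with file_open(path, 'w') as f:
    f.write(u'caf\xe9\n')
with file_open(path) as f:
    content = f.read()

# The bytes on disk are Latin-1, as requested by the attribute.
with open(path, 'rb') as f:
    raw = f.read()
```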

. All AsciiDoc distribution text files are UTF-8 encoded.
. The 'encoding' attribute sets the encoding of input and output files
  (defaults to UTF-8).
. The use of the 'encoding' attribute in the document header is prohibited
  ???  unless the encoding of the header is backward compatible with ASCII
  e.g. ISO-8859-1 (latin-1)

What exactly is the encoding of text from stdin on Linux and Windows?
See:

- stdout encoding is set by the OS environment and is NOT
  sys.getdefaultencoding(). You can read it with sys.stdout.encoding
  but it can only be set externally (see
  https://drj11.wordpress.com/2007/05/14/python-how-is-sysstdoutencoding-chosen/).
  Thankfully on Linux this is normally UTF-8.
  Things aren't so simple with Windows
  (http://superuser.com/questions/239810/setting-utf8-as-default-character-encoding-in-windows-7).
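
A small sketch for inspecting what the interpreter reports for the
standard streams. The `stream_encoding()` helper and its UTF-8 fallback
are assumptions for illustration; under Python 2 the `encoding` attribute
is `None` when the stream is a pipe or a file, which is why a fallback is
needed at all.

```python
import sys

def stream_encoding(stream, fallback='UTF-8'):
    """Return the encoding a stream reports, or fallback if it reports none."""
    return getattr(stream, 'encoding', None) or fallback

stdin_encoding = stream_encoding(sys.stdin)
stdout_encoding = stream_encoding(sys.stdout)
```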


Closing the points of entry:

. Reader to have 'encoding' attribute so includes get the right
  encoding.
. Text from AsciiDoc filters needs to be read with the correct encoding.
. Text from `{sys:}`, `{eval:}` et al. needs to be read with the correct
  encoding.
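
For the filter and `{sys:}` cases, the idea could look roughly like the
sketch below: capture the child process output as bytes and decode it
explicitly. `run_and_decode()` is a hypothetical helper, not an existing
AsciiDoc function, and the child command is only a placeholder.

```python
import subprocess
import sys

def run_and_decode(args, encoding='UTF-8'):
    """Run a command and return its stdout decoded to Unicode text."""
    proc = subprocess.Popen(args, stdout=subprocess.PIPE)
    raw, _ = proc.communicate()      # raw is a byte string
    return raw.decode(encoding)      # decode explicitly, never implicitly

# Placeholder child process that writes a short ASCII string.
cmd = [sys.executable, '-c', "import sys; sys.stdout.write('hello')"]
output = run_and_decode(cmd)
```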

Drop the 'newline' attribute -- it's just an unnecessary complication.

By disabling binary file modes and encode() and decode() I was
able to get asciidoc3.py to work (search for ZZZ's in asciidoc3.py).
But the file read/write code needs to be rewritten following
http://docs.python.org/howto/unicode.html#reading-and-writing-unicode-data

See:
* http://docs.python.org/dev/howto/pyporting.html
* http://docs.python.org/whatsnew/2.6.html

Define a new intrinsic attribute `{py3}` (set in asciidoc.py)
which would be defined if we're using a Python 3 interpreter.
It could then be used to synthesise Python 3 filter names in conf
files e.g.

  filter='graphviz2png{py3?3}.py {verbose?-v} -o "{outdir={indir}}/{imagesdir=}{imagesdir?/}{target}" -L {layout=dot} -F {format=png} -'
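
A rough sketch of the intent, not of the actual asciidoc.py code:
`intrinsic_attributes()` is a hypothetical function, and the conditional
expression below only imitates what the `{py3?3}` reference would expand
to (`3` when the attribute is defined, nothing otherwise).

```python
import sys

def intrinsic_attributes():
    """Return intrinsic attributes; 'py3' is defined only under Python 3."""
    attrs = {}
    if sys.version_info >= (3, 0):
        attrs['py3'] = ''    # defined, with an empty value
    return attrs

attrs = intrinsic_attributes()

# Imitates {py3?3}: '3' if the attribute is defined, else empty.
suffix = '3' if 'py3' in attrs else ''
filter_name = 'graphviz2png%s.py' % suffix
```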

* Use the python `-3` option to check code.
* Switch to Git? Google code supports Git, see also:
  https://code.google.com/p/support/wiki/ConvertingSvnToGit
  https://code.google.com/p/support/wiki/GitFAQ

I ran `2to3 asciidoc.py` and there are a couple of odd things I need
to check out:

See
http://diveintopython3.ep.io/porting-code-to-python-3-with-2to3.html#next

----
-            if Lex.next() is not Title:
+            if next(Lex) is not Title:

-        if len(self.next) <= self.READ_BUFFER_MIN:
+        if len(self.__next__) <= self.READ_BUFFER_MIN:
----
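
The two hunks above are different cases, which the sketch below tries to
separate. `Countdown` and `Reader` are made-up classes. The first shows
the intended transform: an iterator that defines `__next__` (with `next`
kept as an alias for Python 2) works with the built-in `next()`. The
second assumes that `self.next` is a plain data attribute, which is what
taking `len()` of it suggests; in that case the rename to `__next__` is
only a side effect of the fixer, and excluding it with `-x next` as in the
command earlier leaves the name alone.

```python
class Countdown(object):
    """Iterator usable with the built-in next() on Python 2.6+ and 3."""
    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        self.current -= 1
        return self.current + 1

    next = __next__    # Python 2 spelling of the same method


class Reader(object):
    """Here 'next' is a buffer of pending lines, not an iterator method."""
    READ_BUFFER_MIN = 10

    def __init__(self):
        self.next = []

    def needs_refill(self):
        return len(self.next) <= self.READ_BUFFER_MIN
```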


See
http://diveintopython3.ep.io/porting-code-to-python-3-with-2to3.html#dict
Most of these transforms are unnecessary as the results are just being
used for immediate iteration and nothing more.

----
-                for k in d.keys():
+                for k in list(d.keys()):
----
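
A short sketch of when the `list()` wrapper matters, using a made-up
dictionary. Plain iteration needs no copy; a copy of the keys is only
required when the loop body adds or removes entries.

```python
d = {'a': 1, 'b': 2, 'c': 3}

# Immediate iteration: no list() needed on either Python version.
seen = sorted(k for k in d)

# Mutating the dict inside the loop: iterate over a snapshot of the keys.
for k in list(d.keys()):
    if d[k] % 2:
        del d[k]
```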

See
1. http://docs.python.org/py3k/library/2to3.html
// This next link is brilliant!
2. http://diveintopython3.ep.io/porting-code-to-python-3-with-2to3.html


> 
> 
> -- 
> You received this message because you are subscribed to the Google
> Groups "asciidoc" group.
> To view this discussion on the web visit
> https://groups.google.com/d/msg/asciidoc/-/Njh0rK1Fsq0J.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to
> [email protected].
> For more options, visit this group at
> http://groups.google.com/group/asciidoc?hl=en.
