Re: [Numpy-discussion] Internationalization of numpy/scipy docstrings...

2012-05-22 Thread Tim Cera


 Docstrings are not stored in .rst files but in the numpy sources, so
 there are some non-trivial technical and workflow details missing here. But
 besides that, I think translating everything (even into a single language)
 is a massive amount of work, and it's not at all clear if there are enough
 people willing to help out with this. So I'd think it would be better to
 start with just the high-level docs (numpy user guide, scipy tutorial) to
 see how it goes.


I understand that this is non-trivial, for me anyway, because I can't
figure out how to make my way around numpydoc, and documentation editor
code (not quite true, as Pauli accepted a couple of my pull requests, but
I definitely can't make it dance).  This is why I asked for interest and
help on the mailing list.  I think for the people that worked on the
documentation editor, or know Django, or are cleverer than I, the required
changes to the documentation editor might be mid-trivial.  That is my hope
anyway.

I would probably keep the high-level docs separate from the docstring
processing anyway, since the high-level docs are already in a sphinx source
directory.  So I agree that the high-level docs would be the best place to
start; in fact, that is what I was working with when I found the problem
with the sphinx gettext builder mentioned in the original post.

I do want to defend and clarify the docstring processing though.
Docstrings, in the code, will always be English. The documentation editor
is the fulcrum.  The documentation editor will work with the in-code
docstrings *exactly* as it does now.  The documentation editor would be
changed so that when it writes the ReST formatted docstring back into the
code, it *also* writes a *.rst file to a separate sphinx source directory.
These *.rst files would not be part of the numpy source code directory,
but interim files from which the documentation editor and sphinx extract
strings to make *.pot files; pootle + hordes of translators :-) gives *.po
files; then *.po -> *.mo -> *.rst (translated).  The English *.rst, *.pot,
*.po, *.mo files are all interim products behind the scenes.  The
translated *.rst files would NOT be part of the numpy source code, but
packaged separately.
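
As a rough illustration of the "also writes a *.rst file" step, here is a
minimal Python sketch. The helper name export_docstring, the one-file-per-object
layout and the title underline are all assumptions made for the example; the
real documentation editor keeps docstrings in its Django database and would
need its own hook.

```python
import inspect
from pathlib import Path


def export_docstring(obj, qualified_name, out_dir):
    """Write obj's docstring to <out_dir>/<qualified_name>.rst; return the path.

    Hypothetical helper, not part of pydocweb or numpy.
    """
    # inspect.getdoc strips the common indentation from the docstring.
    doc = inspect.getdoc(obj) or ""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # Use the object name as an RST section title above the docstring body.
    body = f"{qualified_name}\n{'=' * len(qualified_name)}\n\n{doc}\n"
    path = out_dir / f"{qualified_name}.rst"
    path.write_text(body, encoding="utf-8")
    return path
```

The output directory would then be the separate sphinx source directory that
the gettext builder reads from, so nothing is written into the numpy tree.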

I must admit that I did hope that there would be more interest.  Maybe I
should have figured out how to put 'maskna' or '1.7' in the subject?

In defense of there not being much interest: the people who would
possibly benefit aren't reading English mailing lists.

Kindest regards,
Tim
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Internationalization of numpy/scipy docstrings...

2012-05-22 Thread josef . pktd
On Tue, May 22, 2012 at 10:51 AM, Tim Cera t...@cerazone.net wrote:

 Docstrings are not stored in .rst files but in the numpy sources, so there
 are some non-trivial technical and workflow details missing here. But
 besides that, I think translating everything (even into a single language)
 is a massive amount of work, and it's not at all clear if there are enough
 people willing to help out with this. So I'd think it would be better to
 start with just the high-level docs (numpy user guide, scipy tutorial) to
 see how it goes.


 I understand that this is non-trivial, for me anyway, because I can't figure
 out how to make my way around numpydoc, and documentation editor code (not
 quite true, as Pauli accepted a couple of my pull requests, but
 I definitely can't make it dance).  This is why I asked for interest and
 help on the mailing list.  I think for the people that worked on the
 documentation editor, or know Django, or are cleverer than I, the required
 changes to the documentation editor might be mid-trivial.  That is my hope
 anyway.

 I would probably keep the high-level docs separate from the docstring
 processing anyway, since the high-level docs are already in a sphinx source
 directory.  So I agree that the high-level docs would be the best place to
 start; in fact, that is what I was working with when I found the problem with
 the sphinx gettext builder mentioned in the original post.

 I do want to defend and clarify the docstring processing though.
 Docstrings, in the code, will always be English. The documentation editor
 is the fulcrum.  The documentation editor will work with the in-code
 docstrings exactly as it does now.  The documentation editor would be
 changed so that when it writes the ReST formatted docstring back into the
 code, it also writes a *.rst file to a separate sphinx source directory.
 These *.rst files would not be part of the numpy source code directory, but
 interim files from which the documentation editor and sphinx extract strings
 to make *.pot files; pootle + hordes of translators :-) gives *.po files;
 then *.po -> *.mo -> *.rst (translated).  The English *.rst, *.pot, *.po,
 *.mo files are all interim products behind the scenes.  The translated *.rst
 files would NOT be part of the numpy source code, but packaged separately.

 I must admit that I did hope that there would be more interest.  Maybe I
 should have figured out how to put 'maskna' or '1.7' in the subject?

 In defense of there not being much interest: the people who would
 possibly benefit aren't reading English mailing lists.

One advantage of getting this done would be that other packages could
follow the same approach.

Just as numpy.testing and numpy's doc standard have spread to related
packages, being able to generate translations might be even more
interesting to downstream packages. There the fraction of end users
who are not used to working in English anyway might be larger than
for numpy itself.

The numpy mailing list may be too narrow to catch the attention of
developers with enough interest and expertise in the area.

Josef




Re: [Numpy-discussion] Internationalization of numpy/scipy docstrings...

2012-05-21 Thread Ralf Gommers
On Sun, May 20, 2012 at 11:59 PM, Tim Cera t...@cerazone.net wrote:

 Are you thinking only about documentation in .rst files (like the
 tutorials), or also the docstrings themselves? The former may be feasible,
 the latter I think will be difficult.


 Everything.  Within the documentation editor the RST docstrings are parsed
 from the functions, so instead of only storing them in the database for
 Django/doceditor to work with, can save them to *.rst files.

 I don't know how integrated we could/would make the documentation
 editor/sphinx/pootle combination, so I think the easiest would be
 integration through files.  Your question points out a detail (and some
 small refinements) that I should have put in the outline from my first
 message:

   0.5. As the pydocweb editor works on docstrings, up-to-date RST files
        are also saved to the file system, which triggers...

   1. The new gettext builder to convert *.rst to *.pot files.

   1.5. (OPTIONAL) Can make a preliminary, automatic translation.  Pootle
        currently supports Google Translate (now costs $) or Apertium.

   2. Translators would use pootle to edit the *.pot files to *.po files

   2.5. Use msgfmt to create *.mo files

   3. From here can choose either:

      a. Use sphinx-build to create new, translated *.rst files from the
         *.mo files.  (my favorite since we would have *.rst files)

      b. OR use gettext in Python to translate docstrings on-the-fly from
         the *.mo files.



Docstrings are not stored in .rst files but in the numpy sources, so there
are some non-trivial technical and workflow details missing here. But
besides that, I think translating everything (even into a single language)
is a massive amount of work, and it's not at all clear if there are enough
people willing to help out with this. So I'd think it would be better to
start with just the high-level docs (numpy user guide, scipy tutorial) to
see how it goes.

Thinking about what languages to translate into would also make sense,
since having a bunch of partial translations lying around doesn't help
anyone. First thought: Spanish, Chinese.

Ralf



 At this point we would need to have an environment variable or other
 configuration mechanism to set the desired locale, which np.info would
 use to find the correct directory/rst file.  Let's just say for sake of my
 example that the configuration is handled by a np.locale function.


   np.info(np.array)
   # display English docstring as it currently does

   np.locale('fr')
   np.info(np.array)
   # display the French docstring


 Reference links:

 sphinx based translation

   http://sphinx.pocoo.org/latest/intl.html

   http://www.slideshare.net/lehmannro/sphinxi18n-the-true-story

 Pootle:

   http://translate.sourceforge.net/wiki/pootle/index

   (You have to get the development versions of translate and pootle to
 work with Django 1.4.)


 Kindest regards,

 Tim



Re: [Numpy-discussion] Internationalization of numpy/scipy docstrings...

2012-05-21 Thread Nathaniel Smith
On Mon, May 21, 2012 at 10:44 PM, Ralf Gommers
ralf.gomm...@googlemail.com wrote:
 Thinking about what languages to translate into would also make sense, since
 having a bunch of partial translations lying around doesn't help anyone.
 First thought: Spanish, Chinese.

It's not like one can tell two translator volunteers to start speaking
the same language so as to better pool their efforts... they kind of
speak whatever they speak. But there is quite a bit of translator
volunteer person-power available out there across many languages. If
Tim gets the infrastructure worked out then advertising on some of the
big translation project mailing lists will probably get a lot of
eyeballs.

- N


Re: [Numpy-discussion] Internationalization of numpy/scipy docstrings...

2012-05-20 Thread Ralf Gommers
On Sun, May 20, 2012 at 12:04 AM, Tim Cera t...@cerazone.net wrote:

 I have thought for a long time that it would be nice to have numpy/scipy
 docs in multiple languages.  I didn't have any idea how to do it until I
 saw http://sphinx.pocoo.org/intl.html.  The gettext builder which is a
 requirement to make this happen is relatively new to sphinx.


 Outline of above applied to numpy/scipy...


 1. pydocweb would use the new gettext builder to convert *.rst to *.pot
    files.

 2. Translators would use pootle to edit the *.pot files to *.po files

    pydocweb or pootle would use msgfmt to create *.mo files

 3. From here can choose either:

    a. Have pydocweb use sphinx-build to create new, translated *.rst
       files from the *.mo files.
       (my favorite since we would have *.rst files)

    b. OR use gettext in Python to translate docstrings on-the-fly from
       the *.mo files.


 A user would then install a language kit, maybe something like scikits
 and access the translated docstring with a new 'np.info'.  As near as I
 can figure, Python 'help' command can't be replaced by something else, so
 'help' would always display the English docstring.


 I have pydocweb and pootle set up locally and working.  I ran into a problem,
 though, with sphinx-build creating the initial *.pot files. It seems to be a
 problem with numpydoc: it fails on 'function' and 'auto*' directives.  I
 tried to look at numpydoc, but it is very intense code and I frankly have
 not been able to find my way around.


 I am willing to put in some work for this to happen. My block right now is
 getting the initial *.pot files.


 Any interest?


Are you thinking only about documentation in .rst files (like the
tutorials), or also the docstrings themselves? The former may be feasible,
the latter I think will be difficult.

Ralf


Re: [Numpy-discussion] Internationalization of numpy/scipy docstrings...

2012-05-20 Thread Tim Cera

 Are you thinking only about documentation in .rst files (like the
 tutorials), or also the docstrings themselves? The former may be feasible,
 the latter I think will be difficult.


Everything.  Within the documentation editor the RST docstrings are parsed
from the functions, so instead of only storing them in the database for
Django/doceditor to work with, can save them to *.rst files.

I don't know how integrated we could/would make the documentation
editor/sphinx/pootle combination, so I think the easiest would be
integration through files.  Your question points out a detail (and some
small refinements) that I should have put in the outline from my first
message:

  0.5. As the pydocweb editor works on docstrings, up-to-date RST files
       are also saved to the file system, which triggers...

  1. The new gettext builder to convert *.rst to *.pot files.

  1.5. (OPTIONAL) Can make a preliminary, automatic translation.  Pootle
       currently supports Google Translate (now costs $) or Apertium.

  2. Translators would use pootle to edit the *.pot files to *.po files

  2.5. Use msgfmt to create *.mo files

  3. From here can choose either:

     a. Use sphinx-build to create new, translated *.rst files from the
        *.mo files.  (my favorite since we would have *.rst files)

     b. OR use gettext in Python to translate docstrings on-the-fly from
        the *.mo files.


At this point we would need to have an environment variable or other
configuration mechanism to set the desired locale, which np.info would use
to find the correct directory/rst file.  Let's just say for sake of my
example that the configuration is handled by a np.locale function.


  np.info(np.array)
  # display English docstring as it currently does

  np.locale('fr')
  np.info(np.array)
  # display the French docstring
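
To make the lookup idea concrete, here is a small self-contained sketch of
how a locale-aware info function could find "the correct directory/rst file".
Nothing here exists in numpy: set_locale stands in for the hypothetical
np.locale, translated_doc for the lookup inside a new np.info, and the
<root>/<locale>/<name>.rst layout is an assumption of the example.

```python
import inspect
from pathlib import Path

# Module-level configuration; both keys are assumptions of this sketch.
_state = {"locale": None, "root": Path("translated_docs")}


def set_locale(code, root=None):
    """Select a language such as 'fr', or None for English (hypothetical np.locale)."""
    _state["locale"] = code
    if root is not None:
        _state["root"] = Path(root)


def translated_doc(obj, name):
    """Return <root>/<locale>/<name>.rst if it exists, else the English docstring."""
    if _state["locale"]:
        candidate = _state["root"] / _state["locale"] / f"{name}.rst"
        if candidate.is_file():
            return candidate.read_text(encoding="utf-8")
    # No locale selected, or no translation installed: fall back to English.
    return inspect.getdoc(obj) or ""
```

Falling back to the English docstring means a partially translated language
kit still gives the user something for every function.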


Reference links:

sphinx based translation

  http://sphinx.pocoo.org/latest/intl.html

  http://www.slideshare.net/lehmannro/sphinxi18n-the-true-story

Pootle:

  http://translate.sourceforge.net/wiki/pootle/index

  (You have to get the development versions of translate and pootle to work
with Django 1.4.)


Kindest regards,

Tim


[Numpy-discussion] Internationalization of numpy/scipy docstrings...

2012-05-19 Thread Tim Cera
I have thought for a long time that it would be nice to have numpy/scipy
docs in multiple languages.  I didn't have any idea how to do it until I
saw http://sphinx.pocoo.org/intl.html.  The gettext builder which is a
requirement to make this happen is relatively new to sphinx.


Outline of above applied to numpy/scipy...


1. pydocweb would use the new gettext builder to convert *.rst to *.pot
   files.

2. Translators would use pootle to edit the *.pot files to *.po files

   pydocweb or pootle would use msgfmt to create *.mo files

3. From here can choose either:

   a. Have pydocweb use sphinx-build to create new, translated *.rst
      files from the *.mo files.
      (my favorite since we would have *.rst files)

   b. OR use gettext in Python to translate docstrings on-the-fly from
      the *.mo files.
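
For anyone unfamiliar with step 1, the sketch below shows roughly what a
*.pot template contains: one msgid per translatable chunk with an empty
msgstr for the translator to fill in. This is a simplification written for
illustration, not the sphinx gettext builder. The function names are made up,
and splitting on blank lines is an assumption (sphinx works on the parsed
document, so titles and directives are handled properly there).

```python
def _po_quote(text):
    """Escape text as a double-quoted PO string literal."""
    escaped = (text.replace("\\", "\\\\")
                   .replace('"', '\\"')
                   .replace("\n", "\\n"))
    return f'"{escaped}"'


def rst_to_pot(rst_text, source_name):
    """Return PO-template text with one empty-msgstr entry per paragraph."""
    # Minimal header entry declaring the encoding of the catalog.
    lines = [
        'msgid ""',
        'msgstr ""',
        '"Content-Type: text/plain; charset=UTF-8\\n"',
        "",
    ]
    seen = set()
    paragraph, start = [], None
    # The extra trailing blank line flushes the final paragraph.
    for lineno, line in enumerate(rst_text.splitlines() + [""], start=1):
        if line.strip():
            if not paragraph:
                start = lineno
            paragraph.append(line.strip())
            continue
        if paragraph:
            msgid = " ".join(paragraph)
            if msgid not in seen:  # each msgid may appear only once
                seen.add(msgid)
                lines += [f"#: {source_name}:{start}",
                          f"msgid {_po_quote(msgid)}",
                          'msgstr ""',
                          ""]
            paragraph = []
    return "\n".join(lines)
```

A translator (or pootle) then copies the template to a per-language *.po file
and fills in each msgstr; msgfmt compiles that file to *.mo.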


A user would then install a language kit, maybe something like scikits
and access the translated docstring with a new 'np.info'.  As near as I can
figure, Python 'help' command can't be replaced by something else, so
'help' would always display the English docstring.


I have pydocweb and pootle set up locally and working.  I ran into a problem,
though, with sphinx-build creating the initial *.pot files. It seems to be a
problem with numpydoc: it fails on 'function' and 'auto*' directives.  I
tried to look at numpydoc, but it is very intense code and I frankly have
not been able to find my way around.


I am willing to put in some work for this to happen. My block right now is
getting the initial *.pot files.


Any interest?


You can see the problem directly by changing into the numpy/doc directory
and using the following command:

sphinx-build -b gettext -P source/ gettext/


Once sphinx-build is working, then the target build directory (which I
called 'gettext' above) would be in a location accessible to pootle.


Kindest regards,
Tim


Re: [Numpy-discussion] Internationalization of numpy/scipy docstrings...

2012-05-19 Thread Nathaniel Smith
On May 19, 2012 11:04 PM, Tim Cera t...@cerazone.net wrote:
 A user would then install a language kit, maybe something like scikits
and access the translated docstring with a new 'np.info'.  As near as I can
figure, Python 'help' command can't be replaced by something else, so
'help' would always display the English docstring.

help() just returns the __doc__ attribute, but a large number of numpy's
__doc__ attributes are set up by code at import time, so in principle even
these could be run through gettext pretty easily.
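
A minimal sketch of that idea, under some assumptions: DictTranslations is an
in-memory stand-in for a catalog that would normally be loaded from a compiled
*.mo file (for example via gettext.translation), and localize_docstrings is a
made-up helper, not anything numpy provides. The whole docstring is used as
the msgid, which is only one possible choice.

```python
import gettext


class DictTranslations(gettext.NullTranslations):
    """In-memory catalog standing in for a compiled *.mo file (illustration only)."""

    def __init__(self, catalog):
        super().__init__()
        self._catalog = dict(catalog)

    def gettext(self, message):
        # Fall back to the original (English) text when no translation exists.
        return self._catalog.get(message, message)


def localize_docstrings(namespace, translations):
    """Replace __doc__ on every callable in namespace with its translation."""
    for obj in namespace.values():
        doc = getattr(obj, "__doc__", None)
        if callable(obj) and isinstance(doc, str):
            try:
                obj.__doc__ = translations.gettext(doc)
            except (AttributeError, TypeError):
                # Some objects (e.g. built-in types) have a read-only __doc__.
                pass
```

Run once at import time over a module's namespace, this would make help()
show the translated text with no change to help() itself, at the cost of
mutating the docstrings in place.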

-n


Re: [Numpy-discussion] Internationalization of numpy/scipy docstrings...

2012-05-19 Thread Tim Cera
On Sat, May 19, 2012 at 8:16 PM, Nathaniel Smith n...@pobox.com wrote:

 help() just returns the __doc__ attribute, but a large number of numpy's
 __doc__ attributes are set up by code at import time, so in principle even
 these could be run through gettext pretty easily.

I didn't know that.  I suggested modifying np.info because I suspect that
would be easier: the changes to support i18n would be confined to one
command.  Of course if there is something easier/better, let's go with
that.

Kindest regards,
Tim