Re: direction for documentation across various APIs that share common doc source

2019-03-12 Thread Haibin Lin
Hi Aaron,

You can see that the examples listed in elemwise_addDoc class in
https://github.com/apache/incubator-mxnet/blob/master/python/mxnet/ndarray_doc.py#L57
are appended to the example section of elemwise_add op in
http://mxnet.incubator.apache.org/api/python/ndarray/ndarray.html?highlight=reshape#mxnet.ndarray.elemwise_add


You can take a look at the _build_doc function in ndarray_doc.py which
contains the logic to append examples from xxDoc classes. The
_build_doc function is called in _generate_ndarray_function_code when these
python functions are generated:
https://github.com/apache/incubator-mxnet/blob/e3a51b5a3ed989bf1e9c9f53b56819b32957527f/python/mxnet/ndarray/register.py#L54-L60
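
The mechanism described above can be sketched roughly as follows. This is a simplified illustration and not the actual MXNet code: `OpDoc` and `build_doc` are hypothetical names, and the example text inside the docstring is only placeholder content that is never executed here.

```python
import textwrap


class OpDoc:
    """Hypothetical marker base class; subclasses keep extra docs in their docstrings."""


class elemwise_addDoc(OpDoc):
    """
    Example
    -------
    >>> x = mx.nd.array([1, 2, 3])
    >>> y = mx.nd.array([4, 5, 6])
    >>> z = mx.nd.elemwise_add(x, y)
    """


def build_doc(func_name, description):
    """Append the docstring of any `<func_name>Doc` subclass to the description."""
    extra = [
        textwrap.dedent(cls.__doc__).strip()
        for cls in OpDoc.__subclasses__()
        if cls.__name__ == func_name + "Doc" and cls.__doc__
    ]
    return "\n\n".join([description.strip()] + extra)


# The generated docstring is the shared description plus the Python-only example.
print(build_doc("elemwise_add", "Adds arguments element-wise."))
```

An operator without a matching `...Doc` class simply keeps the shared description unchanged.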


Best,
Haibin



Re: direction for documentation across various APIs that share common doc source

2019-03-06 Thread Aaron Markham
Mu,
Thanks for your response. I have some follow-up questions now. A lot, actually.
Can you explain more about what ndarray_doc.py is doing? I see that
ndarray.register is calling it to do some transformations to
docstrings by injecting "float". This seems quite buried to me. Some
may have wondered, while tracing through the ndarray docs, "where does
float come from? It's not listed here in the docstring. Strange."
Is there a document that describes this pattern so other developers
know how to use it and what impact it has? Why am I not seeing the
pattern anywhere other than ndarray and symbol, and only for float
support? Why doesn't symbol have a corresponding symbol_doc.py? This
makes me wonder about the various issues I've seen where functions are
not properly described, or not described at all, when Sphinx runs. Is
this something that should have been applied more widely, but has not?
Wouldn't it make sense to have the docs massaging processes centralized
for maintenance and clarity?

Aside from that, I'm still not seeing the path for solving the issue
with R and Scala and Java showing pseudocode or Python code in their
examples by using `make doctest`. Maybe they're first steps to make
sure Python examples execute, but don't extend a solution to any other
language binding? That's fine if so, but I still want to keep
exploring what we do to facilitate good docs for the other bindings.
Are you perhaps suggesting that each language binding follow this
rewriting of the docstrings pattern that's in ndarray and symbol?

Can you look at this PR and provide feedback on a tangible example of
how to proceed? https://github.com/apache/incubator-mxnet/pull/14243



Vishaal & Anton, thanks for your feedback too. Flagging the code makes
a lot of sense as then it would be quite apparent what its intended
language is. Rerunning sphinx to rewrite the output could work, and
that assumes those packages have something specific and relevant to
inject. Unfortunately for R, Scala, and Java, that doesn't seem to be
the case at this point. Please correct me if I'm missing something
here.



Cheers,
Aaron


Re: direction for documentation across various APIs that share common doc source

2019-03-05 Thread Mu Li
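The original design puts pseudo-code in .cc files (e.g. ndarray.cc
<https://github.com/apache/incubator-mxnet/blob/master/src/ndarray/ndarray.cc>)
that are language independent, and then Python code in .py files (e.g.
ndarray_doc.py
<https://github.com/apache/incubator-mxnet/blob/master/python/mxnet/ndarray_doc.py>).
However, we haven't defined the pseudo-code format, so some code in the .cc
files looks like Python, and we didn't enable doctest, so some code in the
.py files cannot be executed.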

I suggest the following next steps:

1. Follow TensorFlow's pseudo-code format, e.g.
https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/fill
2. Enable doctest when building the docs (make doctest)
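
To illustrate what step 2 buys, here is a minimal sketch using Python's standard doctest module. The `fill` function is a hypothetical pure-Python stand-in (not the MXNet or TensorFlow operator), and `count_failures` is an illustrative helper; the point is only that an example whose shown output has drifted from the real behavior gets flagged.

```python
import doctest


def fill(shape, value):
    """Return a nested list with the given 2-D shape, filled with `value`.

    >>> fill((2, 3), 9)
    [[9, 9, 9], [9, 9, 9]]
    """
    rows, cols = shape
    return [[value] * cols for _ in range(rows)]


# A stale example: the documented output no longer matches what fill() returns.
STALE_DOC = """
>>> fill((1, 2), 0)
[0, 0]
"""


def count_failures(docstring, globs):
    """Run the examples found in `docstring` and return how many failed."""
    test = doctest.DocTestParser().get_doctest(
        docstring, dict(globs), "example", None, 0
    )
    runner = doctest.DocTestRunner(verbose=False)
    # Discard the failure report; only the count is needed for this sketch.
    return runner.run(test, out=lambda text: None).failed


print(count_failures(fill.__doc__, {"fill": fill}))
print(count_failures(STALE_DOC, {"fill": fill}))
```

Running the examples as part of the docs build turns a silently wrong snippet into a build failure, which is the property the suggestion relies on.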



Re: direction for documentation across various APIs that share common doc source

2019-03-04 Thread Vishaal Kapoor
Hey Aaron and Anton,

One of MXNet's strengths over other frameworks is the plethora of language
bindings, so having language-specific examples is important. Perhaps
indicating that an example is Python code by using a "#python" header on
the example would make it clear. Of course, for the important APIs,
docstrings for the most popular languages would be desirable. Additionally,
making the holes clear would make it easier for users to contribute
documentation for their favorite languages.
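
A minimal sketch of how such headers could be consumed, assuming a convention where a line of the form "#<language>" starts that language's example. The convention, the helper names, and the placeholder example lines are all assumptions made for illustration; none of this is existing MXNet tooling.

```python
def examples_for(docstring, language):
    """Return the example lines that follow a '#<language>' header."""
    selected, current = [], None
    for line in docstring.splitlines():
        stripped = line.strip()
        # A header is '#' immediately followed by letters only, e.g. '#python'.
        if stripped.startswith("#") and stripped[1:].isalpha():
            current = stripped[1:].lower()
            continue
        if current == language.lower() and stripped:
            selected.append(stripped)
    return selected


def missing_languages(docstring, wanted):
    """List the languages that have no example yet (the 'holes')."""
    return [lang for lang in wanted if not examples_for(docstring, lang)]


DOC = """
Adds arguments element-wise.

#python
z = x + y

#scala
val z = x + y
"""

print(examples_for(DOC, "scala"))
print(missing_languages(DOC, ["python", "scala", "r"]))
```

Each binding's doc generator would keep only its own block, and the list of missing languages gives contributors a concrete to-do list.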

Vishaal



Re: direction for documentation across various APIs that share common doc source

2019-03-04 Thread Anton Chernov
Hi Aaron,

Here is an idea: The main documentation is the one in .cc files. In theory
the language bindings should just override some stuff from it, like
examples. If I understand correctly, there is a Sphinx script that generates
the documentation. If we run it first for the core src folder and then from a
language binding folder, it could use the -f, --force flag [1] to override
the needed parts. That would allow us to provide a 'default' version of the
documentation that could be adjusted where needed.
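
The "default plus override" idea can be sketched in plain Python. Note that this is not what sphinx-apidoc does: its --force flag overwrites whole generated files, whereas the hypothetical `merge_docs` below works on individual sections held in dictionaries, and the operator text is placeholder content.

```python
def merge_docs(core_docs, binding_docs):
    """Overlay binding-specific sections on top of the core defaults."""
    merged = {}
    for op_name, sections in core_docs.items():
        overrides = binding_docs.get(op_name, {})
        # Later keys win, so a binding replaces only the sections it defines.
        merged[op_name] = {**sections, **overrides}
    return merged


core = {
    "elemwise_add": {
        "description": "Adds arguments element-wise.",
        "example": "z = x + y",
    },
    "reshape": {
        "description": "Reshapes the input array.",
        "example": "y = reshape(x, shape)",
    },
}

# The Scala binding overrides just one example and inherits everything else.
scala_overrides = {"elemwise_add": {"example": "val z = x + y"}}

docs = merge_docs(core, scala_overrides)
print(docs["elemwise_add"]["example"])
print(docs["reshape"]["example"])
```

Operators a binding has not customized fall back to the core text, which matches the 'default' version described above.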

Best
Anton

[1]
http://www.sphinx-doc.org/en/stable/man/sphinx-apidoc.html#sphinx-apidoc-manual-page



direction for documentation across various APIs that share common doc source

2019-02-25 Thread Aaron Markham
Hi everyone,
A recent issue and pending PR have brought a thorny docs situation to
my attention again and I'd like to hear from the community on how to
proceed.
We currently get some of the docs for the Python API pulled out of .cc
files. Other APIs also get docs from there, or pull the Python docs to
autogenerate their docs. This presents some problems:
1. (Some of) The code examples provided don't run when you copy and
paste them. [1]
2. The code examples that show up in other APIs won't work as the code
is Python and for (many/complicated) statements the syntax can be
wrong.

When I try out something new and go for the hello world example or
browse around I do expect the docs' code examples to work. If they
don't, well, that's a bad sign and I move on to another project. I'd
like for new users to have a great experience no matter what language
they use.

One fix is to go ahead and be "Python 1st" and make sure the code
executes. This route is proposed in a PR for some NDArray operators.
[2] As I mention in the PR comments, this has the drawback of being
very specific to Python, and the pseudo-code, for what it's worth,
showing up in Scala docs (for example) will be much more obviously out
of place. If I were a Scala person, I'd probably find this
irritating. The same goes for R.

So... what should we do? Here are some ideas:
a) I thought about providing different examples in the .cc code, one
for each language and then making sure those are parsed out properly
when the APIs are generating their docs. I'm not sure how feasible
this is.
b) I thought that it would be nice if each operator had a wrapper for
each language API, and this is where the example payload resides.
Maybe docstrings go here too or the common docstrings just bubble up
from the cc file. The benefit is that changes for a specific language
remain in those packages and don't touch the shared core files.
c) Another route is to keep the examples in the .cc files pseudo-code,
but then also make sure each language has real examples in their docs.
Then, any code block that's in the docs now that won't execute should
be changed to a preformatted text block so people don't confuse it
with functional code.
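
As a rough illustration of option (c), a docs build step could decide per snippet whether to present it as Python or as plain preformatted text. The sketch below only checks whether a snippet parses as Python, which is necessary but not sufficient for it to run; the helper names and the choice of reStructuredText directives are assumptions.

```python
import ast


def is_valid_python(snippet):
    """Return True if the snippet at least parses as Python source."""
    try:
        ast.parse(snippet)
    except SyntaxError:
        return False
    return True


def block_directive(snippet):
    """Pick a directive: Python highlighting if it parses, plain text otherwise."""
    if is_valid_python(snippet):
        return ".. code-block:: python"
    return ".. code-block:: text"


real_code = "x = [[1, 2], [3, 4]]\ny = [row[::-1] for row in x]"
pseudo_code = "fill([2, 3], 9) ==> [[9, 9, 9], [9, 9, 9]]"

print(block_directive(real_code))
print(block_directive(pseudo_code))
```

A parse check like this catches only snippets that are not Python at all; confirming that an example actually executes would still need something like doctest.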

I really don't like any of these options as they each sound like a ton
of work and difficult to maintain. Are there any projects that solve
this problem in some elegant and efficient way?

Cheers,
Aaron

[1] https://github.com/apache/incubator-mxnet/issues/14232
[2] https://github.com/apache/incubator-mxnet/pull/14243