The Rust blog post is now live: https://arrow.apache.org/blog/2020/10/27/rust-2.0.0-release/
On Sun, Oct 18, 2020 at 12:46 PM Fernando Herrera <fernando.j.herr...@gmail.com> wrote:

> Thanks Jorge for helping me to get across the need for a user guide. The
> examples you used are exactly what I had in mind. It would be great if the
> project had a user guide similar to tokio's. We could use this guide to
> explain how to get started and some examples using the available crates
> (Arrow, Arrow-Flight and Datafusion)
>
> I think that in order to start with the guide, I could take the approach
> you suggested using either doc_comment or rust-skeptic to write down md
> files to sketch the user guide. This guide could be included in each of the
> project folders, e.g. arrow, flight and datafusion. Although, I'm a bit
> inclined to have a single user guide (placed at the rust folder) that will
> include a reference to all the elements that are included in the rust arrow
> folder. This way the guide could be planned in such a way that a new user
> would start learning about arrow arrays and finish doing queries on data
> loaded from CSV or parquet files.
>
> Is this something that could interest you?
>
> Regards,
> Fernando
>
> On Fri, Oct 16, 2020 at 10:32 PM Jorge Cardoso Leitão <jorgecarlei...@gmail.com> wrote:
>
> > Hi,
> >
> > I would like to thank Fernando for raising this concern here: I also think
> > that we still do not put enough effort into the documentation :) I admit that
> > when I started in the project, I also had that need and just had some time
> > to go through the code.
> >
> > First, I find it useful to distinguish types of documentation:
> >
> > 1. API references
> > 2. user guide / tutorial
> >
> > The distinction between the two being that the former covers detailed usage
> > of a given function/struct/trait etc, while the latter covers usage of the
> > library as a whole (e.g. how a function is used in combination with
> > others).
> > Virtually every <https://docs.djangoproject.com/en/3.1/intro/tutorial01/>
> > mature <https://pandas.pydata.org/docs/user_guide/index.html>
> > library <https://spark.apache.org/docs/latest/quick-start.html>
> > or <https://www.tensorflow.org/tutorials/quickstart/beginner>
> > framework <https://docs.python.org/3/tutorial/> has both, as they serve
> > different but well-defined use cases.
> >
> > The canonical way of documenting an API in rust is via `docs.rs`, which
> > generates a format common for rust projects and has a *significant* benefit
> > for both writers and readers (for Python users, auto-docs on steroids:
> > auto-links to class declarations, references, testing the examples is
> > part of running the tests, the documentation is written next to the actual
> > source code, reference to the source code on a single click). Rust users
> > expect the API documentation to be in `docs.rs` and released as part of
> > the crate. I agree with @Andy that we should stick to docs.rs for this. While
> > there is always room for improvement, we do have the basics in place.
> >
> > I think that Fernando is alluding to the fact that we do not have a user
> > guide / tutorial, and I agree: we are missing one such as tokio
> > <https://tokio.rs/tokio/tutorial>'s, SIMD
> > <https://rust-lang.github.io/packed_simd/perf-guide/introduction.html>'s,
> > Rocket <https://rocket.rs/v0.4/guide/>'s or rust's book
> > <https://doc.rust-lang.org/book/>, that covers how to use the
> > library/framework.
> >
> > The main challenge is to ensure that the guide does not go out of date.
> > Looking at what other rust libs are doing, Serde, Tokio and Rocket write
> > their guides in markdown and test the code in their guides (here: tokio
> > <https://github.com/tokio-rs/website/tree/master/doc-test>, Rocket
> > <https://github.com/SergioBenitez/Rocket/tree/v0.4/site/tests>).
> > Rocket uses its own codegen to test the docs, tokio uses doc_comment
> > <https://docs.rs/doc-comment/0.3.3/doc_comment/>
> >
> > > The point of this (small) crate is to allow you to add doc comments from
> > > macros or to test external markdown files' code blocks through rustdoc.
> >
> > and serde uses rust-skeptic <https://github.com/budziq/rust-skeptic>.
> >
> > Thus, one idea is to write the guide in markdown on each (arrow and
> > datafusion) crate, run the examples there as part of the testing with
> > doc_comment <https://docs.rs/doc-comment/0.3.3/doc_comment/> or
> > rust-skeptic <https://github.com/budziq/rust-skeptic>, and include these on
> > arrow's official documentation on build (we would need to depend on a third-party
> > Sphinx extension <https://www.sphinx-doc.org/en/master/usage/markdown.html>
> > for this).
> >
> > This way, we keep the examples up-to-date, and the style and location close
> > to other implementations' documentation.
> >
> > Would this be an option?
> >
> > Best,
> > Jorge
> >
> > On Sat, Oct 17, 2020 at 12:48 AM Fernando Herrera <fernando.j.herr...@gmail.com> wrote:
> >
> > > I understand the concern, especially with the project changing that
> > > quickly. However, I haven't found good material that I can use to learn
> > > how to use the crate. I know that each module has a lot of tests (which I'm
> > > thankful for) but going from one test case to the other doesn't work well
> > > as learning material. It is a bit hard to find a starting point within the
> > > project, especially if it's your first time seeing the code. Should one
> > > start with the datatypes.rs or with the builder.rs?
> > >
> > > Also, I think it would help a lot to have a more relaxed approach (like
> > > "learning rust with entirely too many lists") rather than a reference
> > > approach (like the RTF).
> > > I see the RTF as something you use to find
> > > references regarding the code, rather than learning material I would use
> > > to grasp what can be done with the crate. That's why I was suggesting a
> > > book format, like the one that is used for Ballista. If you want
> > > reference material you can always have a look at the documentation created
> > > within the crate.
> > >
> > > What do you think?
> > >
> > > @Andy Grove... is it possible to take part in your upcoming presentation?
> > >
> > > On Fri, Oct 16, 2020 at 5:23 PM Micah Kornfield <emkornfi...@gmail.com> wrote:
> > >
> > > > > We should be careful with the balance of content between the Restructured
> > > > > Text Format documentation and the documentation in the crate that gets
> > > > > published to docs.rs though. The rustdoc documentation is unit-tested to
> > > > > ensure that it is always up to date and we will have to manually update the
> > > > > RTF documentation for each release, and the project is still evolving
> > > > > rather quickly.
> > > >
> > > > If rust offers this out of the box then that definitely seems preferable.
> > > > At some point it would be nice to enable doctest [1] for all of our
> > > > snippets in the main repo.
> > > >
> > > > [1] https://www.sphinx-doc.org/en/master/usage/extensions/doctest.html
> > > >
> > > > On Fri, Oct 16, 2020 at 3:17 PM Andy Grove <andygrov...@gmail.com> wrote:
> > > >
> > > > > I think that it would be great to produce this kind of content. I'm giving
> > > > > a presentation on Arrow to my local Rust meetup (virtually) next week and
> > > > > these are similar to the topics I will be covering there.
> > > > >
> > > > > We should be careful with the balance of content between the Restructured
> > > > > Text Format documentation and the documentation in the crate that gets
> > > > > published to docs.rs though.
> > > > > The rustdoc documentation is unit-tested to
> > > > > ensure that it is always up to date and we will have to manually update the
> > > > > RTF documentation for each release, and the project is still evolving
> > > > > rather quickly.
> > > > >
> > > > > If the sample code included in RTF also exists as examples in the repo that
> > > > > get tested then we can just copy and paste the contents over each time we
> > > > > release perhaps.
> > > > >
> > > > > Andy.
> > > > >
> > > > > On Fri, Oct 16, 2020 at 3:59 PM Micah Kornfield <emkornfi...@gmail.com> wrote:
> > > > >
> > > > > > Java and C++ have tutorials in Restructured Text Format in the docs folder
> > > > > > [1]. I think creating something similar for Rust might be the best place
> > > > > > to start. These are rendered on the website. For example Java is located
> > > > > > at [2].
> > > > > >
> > > > > > [1] https://github.com/apache/arrow/tree/master/docs/source
> > > > > > [2] https://arrow.apache.org/docs/java/index.html
> > > > > >
> > > > > > On Fri, Oct 16, 2020 at 2:48 PM Fernando Herrera <fernando.j.herr...@gmail.com> wrote:
> > > > > >
> > > > > > > I was working on the blog post I mentioned before regarding Arrow usage
> > > > > > > (rust) and how to use the different elements available in the crate. After
> > > > > > > some thought, these were the topics I want to include:
> > > > > > >
> > > > > > > 1. Array examples and what they look like
> > > > > > >    Basic arrays and nested arrays
> > > > > > >    The buffer structure and how data is stored
> > > > > > >    Builders usage
> > > > > > >    Examples of complex arrays and how to construct them (using builders and
> > > > > > >    from)
> > > > > > > 2. What is a record batch?
> > > > > > >    How to construct a record batch
> > > > > > >    How a RecordBatch is used with IPC
> > > > > > > 3. How to read files?
> > > > > > >    CSV files and Parquet files
> > > > > > > 4. How to share information
> > > > > > >    What is Arrow flight?
> > > > > > >    How to set up a server with Rust
> > > > > > >    Examples
> > > > > > > 5. How to query information from arrays?
> > > > > > >    Datafusion examples
> > > > > > >
> > > > > > > However, as I was working on the examples
> > > > > > > <https://github.com/elferherrera/test_example/blob/master/src/main.rs> that
> > > > > > > I was planning to use (most of them came from the Arrow repository) I
> > > > > > > thought that the best format would be a book, something similar to the Rust
> > > > > > > book. I think this format will help us to fully explain how each
> > > > > > > constructor can be used in detail and how each of the data arrays can be
> > > > > > > used and manipulated.
> > > > > > >
> > > > > > > What do you think about it?
> > > > > > >
> > > > > > > I could start the book using the examples in the repository and the tests
> > > > > > > done as a base. However, I cannot find a quick tutorial on setting up a
> > > > > > > book like that, let alone how to host it. I know it has to be made using
> > > > > > > .md files, but that's as far as I have got. Can somebody give me a pointer
> > > > > > > on setting up something like that?
> > > > > > >
> > > > > > > Regards
> > > > > > >
> > > > > > > On Thu, Oct 15, 2020 at 3:18 PM Mark Farnan <m...@markfarnan.com> wrote:
> > > > > > >
> > > > > > > > I would agree with this.
> > > > > > > >
> > > > > > > > I’ve been working with the Go Arrow library for the last few weeks, and it
> > > > > > > > took a while to get my head around it all / how to use it etc.
> > > > > > > > Even then not sure I’ve got it right.
> > > > > > > >
> > > > > > > > Usage examples would be great.
> > > > > > > >
> > > > > > > > Regards
> > > > > > > >
> > > > > > > > Mark
> > > > > > > >
> > > > > > > > > On Oct 14, 2020, at 4:08 PM, Fernando Herrera <fernando.j.herr...@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > I was wondering if besides this blog post there should be another one with
> > > > > > > > > an example of usage. I think that is one of the key things missing for
> > > > > > > > > Arrow in general. This example should show the problems that Arrow is
> > > > > > > > > solving and how to implement the solution in real life.
> > > > > > > > >
> > > > > > > > > On Tue, Oct 13, 2020 at 10:12 AM Andy Grove <andygrov...@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > >> There has been a huge amount of activity in the Rust subproject for the
> > > > > > > > >> 2.0.0 release and I think that we should write a Rust-specific blog post to
> > > > > > > > >> go on the Arrow blog.
> > > > > > > > >>
> > > > > > > > >> I made a brief start at a Google doc, which is mostly just bullet points
> > > > > > > > >> listing some things we could talk about. I'm sure I've missed some things,
> > > > > > > > >> and maybe we have too many things to talk about so we might want to try and
> > > > > > > > >> summarize some of this.
> > > > > > > > >>
> > > > > > > > >> Here is the doc ... I would appreciate any help anyone can provide with
> > > > > > > > >> this.
> > > > > > > > >> Perhaps if each contributor could flesh out the content around things
> > > > > > > > >> they directly worked on or are knowledgeable about, that would be great.
> > > > > > > > >>
> > > > > > > > >> https://docs.google.com/document/d/1RY7oa7ldi4RnyFzk3_5NHiiQl7IcvZgXFq3FYr5iwFc/edit?usp=sharing
> > > > > > > > >>
> > > > > > > > >> Thanks,
> > > > > > > > >>
> > > > > > > > >> Andy.