On Thu, Dec 31, 2015 at 5:17 AM, John Russell <[email protected]> wrote:
>
> I would say there's a fair bit of decision-making and followup work having to
> do with documentation.

Agreed. This doesn't need to be tied up with the podling report though, so
I've changed the subject line to reflect that.

>
> For example, the current Impala docs that are embedded within the Cloudera
> doc library cover a wide range of subjects:
>
> - "How to use Impala with <component XYZ>". For example, Impala with Sentry,
> Impala with HBase, Impala with S3, Impala with Isilon... Some components are
> Hadoop-based, others are more specific to what's shipped or integrated with
> CDH. I feel like we should have a spreadsheet because these seem like
> decisions to make on a case-by-case basis.
>
> - "How to do <task XYZ> with Impala". Performance tuning, troubleshooting,
> deployment planning. Same kinds of considerations as the previous bullet.
> Many of these aren't strictly part of core Impala features, rather they're
> things that could have been delivered via blog posts, O'Reilly books, etc.
> Again, there could be some amount of identifying / deciding / untangling to
> produce the right subset to go in Apache-oriented docs.
>
> - "How to do <task XYZ> with Impala in Cloudera Manager". That seems like an
> easy call to say, that kind of stuff doesn't get donated to Apache because
> it's CDH-specific. That kind of content though is intermixed with "how to do
> <task XYZ> _without_ Cloudera Manager" so it would be some work to untangle
> instructions like that.
>
> - "CREATE TABLE" and similar language reference stuff. Doesn't every SQL
> engine in the open source arena come with a language reference of one sort or
> another... So I assume there has to be something either donated or created
> from scratch along those lines. (Although my open source experience is with
> MySQL, where the docs are under a more restrictive license than the software,
> so I don't have exact precedents to go by.)
>
> Assuming that some amount of existing CDH doc is donated, then for purposes
> of building, accepting contributions, etc. do we need to convert the content
> to some particular format or use some specific build system? The doc content
> that I'm talking about is currently in XML, with a DTD (DITA) that can be
> built using an all-open-source toolchain. The format and toolchain might be
> a little more heavyweight than on a lot of other Apache projects.

There's no mandated documentation system for projects at Apache, so using
DITA shouldn't be a problem, especially since it can be built using an open
source toolchain, as you point out. Having some instructions on how to build
the docs would be useful if they don't already exist.

>
> The main advantage of the current format for the Impala doc library is ease
> of reuse. So there's the question of whether Apache-donated doc like the
> language reference then _only_ exists in the context of the project site, or
> gets reused within the doc library on cloudera.com. There are pros and cons
> either way. Even if we centralize future docs on the impala.io site, so
> there isn't a new instance corresponding to each new CDH x.y release, there
> are still all the older instances of those pages from CDH 4.x, CDH 5.x,
> Impala 1.x, and Impala 2.x docs on cloudera.com.
>
> I've been cogitating over these considerations the last few weeks, but no
> approach has really jumped out at me as a slam dunk:
>
> a) Rip as much existing doc out of the Cloudera library as possible, convert
> to the most contributor-friendly format, decouple entirely from the CDH
> library?
> b) Donate core Impala feature docs only, keep the XML format the same,
> encourage verbatim reuse of doc content across CDH and other distributions
> that include Impala?

I would vote for a combination of a and b - donate all non-CDH-specific doc,
and keep the existing XML format (DITA).

Cheers,
Tom

> c) Some middle ground? For example, it would be possible to mix and match
> the current XML doc format with user-contributed content in Markdown format.
>
> Thanks,
> John
>
> > On Dec 30, 2015, at 3:07 PM, Henry Robinson <[email protected]> wrote:
> >
> > Hi all -
> >
> > Here's a draft of our inaugural podling report. Per the usual guidelines,
> > Impala has to submit three monthly reports to the Incubator PMC, after
> > which we report every quarter. The purpose of the report is to expose the
> > current state of the graduation effort to the Incubator, and to flag any
> > problems that require Incubator attention.
> >
> > I hope this report also sheds a little light on what is needed to be done
> > to move Impala's development in its entirety to the ASF and its
> > infrastructure. We are looking forward to making quick progress on some of
> > these items in 2016.
> >
> > If anyone has any further comments or edits they'd like to make, please
> > respond to this thread. I am on a short timeline as I fly internationally
> > tomorrow and will be out of contact for about ten days, so I plan to post
> > this to the Incubator wiki tomorrow morning. Any edits can then be made
> > there.
> >
> > Thanks,
> > Henry
> >
> > --------------------
> > Impala
> >
> > Impala is a high-performance C++ and Java SQL query engine for data stored in
> > Apache Hadoop-based clusters.
> >
> > Impala has been incubating since 2015-12-03.
> >
> > Three most important issues to address in the move towards graduation:
> >
> > 1. Resolve any issues around use of Gerrit as code-review tool.
> > 2. Movement of existing JIRA / Git / wiki / e-mail resources to Apache
> > equivalents
> > 3. Initial release as incubating project.
> >
> > Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
> > aware of?
> >
> > None.
> >
> > How has the community developed since the last report?
> >
> > Slowly - Impala is still in the very early stages of incubation, and
> > performing the mechanical tasks of code movement and infrastructure setup
> > is our first priority. The holiday period in the United States has slowed
> > this effort slightly, but we look forward to picking up pace in early 2016.
> > There have been no additions to the committer or PMC lists since incubation
> > began.
> >
> > How has the project developed since the last report?
> >
> > We have performed some of the basic initial tasks for incubation -
> > establishing wiki pages, Git repositories and accounts for the initial
> > committer set. Our next steps are:
> >
> > 1. Finalize the SGA from Cloudera
> > 2. Move existing @cloudera.org e-mail aliases to their
> > @impala.incubator.apache.org equivalents.
> > 3. Move source code from Cloudera git repository to Apache git repo.
> > 4. Improve out-of-box build and test experience so that community can
> > easily evaluate release artifacts.
> > 5. Migrate cloudera.org JIRA tickets to issues.apache.org.
> >
> >
> > Date of last release:
> >
> > NA
> >
> > When were the last committers or PMC members elected?
> >
> > At the time of the Incubation vote, 2015-12-03.
>
