On Fri, Aug 11, 2017 at 11:21 AM Keith Turner <[email protected]> wrote:
> On Wed, Aug 9, 2017 at 1:34 PM, Christopher <[email protected]> wrote:
> > On Wed, Aug 9, 2017 at 10:41 AM Keith Turner <[email protected]> wrote:
> >
> >> On Tue, Aug 8, 2017 at 7:36 PM, Christopher <[email protected]> wrote:
> >> > On Tue, Aug 8, 2017 at 5:47 PM Keith Turner <[email protected]> wrote:
> >> >>
> >> >> For reference I used the following to collect stats
> >> >>
> >> >> https://github.com/issues?utf8=%E2%9C%93&q=is%3Aissue+repo%3Aapache%2Ffluo+repo%3Aapache%2Ffluo-recipes+repo%3Aapache%2Ffluo-website+closed%3A%3E%3D2017-05-01+
> >> >>
> >> >> https://github.com/issues?utf8=%E2%9C%93&q=is%3Aissue+repo%3Aapache%2Ffluo+repo%3Aapache%2Ffluo-recipes+repo%3Aapache%2Ffluo-website+created%3A%3E%3D2017-05-01+
> >> >>
> >> >> https://github.com/issues?utf8=%E2%9C%93&q=is%3Apr+repo%3Aapache%2Ffluo+repo%3Aapache%2Ffluo-recipes+repo%3Aapache%2Ffluo-website+closed%3A%3E%3D2017-05-01+
> >> >>
> >> >> https://github.com/issues?utf8=%E2%9C%93&q=is%3Apr+repo%3Aapache%2Ffluo+repo%3Aapache%2Ffluo-recipes+repo%3Aapache%2Ffluo-website+created%3A%3E%3D2017-05-01+
> >> >>
> >> >> git log --since 2017-05-01 | grep Author | sort -u
> >> >>
> >> >> On Tue, Aug 8, 2017 at 5:46 PM, Keith Turner <[email protected]> wrote:
> >> >> > Here is a draft of the report
> >> >> >
> >> >> > ## Description:
> >> >> >
> >> >> > - Apache Fluo is an open source implementation of Percolator (which populates
> >> >> >   Google's search index) for Apache Accumulo. With Fluo, users can continuously
> >> >> >   join new data into large existing data sets without reprocessing all data.
> >> >> >   Unlike batch and streaming frameworks, Fluo offers much lower latency and can
> >> >> >   operate on extremely large data sets.
> >> >
> >> > I'm not a big fan of this description. There's some elements of debatable
> >> > accuracy and relevancy.
> >> > I suggest dropping the history/background and
> >>
> >> What do you think is inaccurate?
> >>
> >
> Good feedback and discussion, definitely influenced what I put in the
> report. I never responded because I was in a time crunch trying to
> get the report out the door. I am going to open an issue to improve
> the intro on the website based on this discussion.
>
> > It's not that it's inaccurate... it's that it's temporal. Saying that it
> > offers "much lower latency" compared to others is not likely to be a
> > resilient statement as advances in those other frameworks progress. It may
>
> I think it's reasonable to write things that one believes are presently
> true. If there is a concern, one can always explicitly state it's subject
> to change.
>
> > be true... but it could not be true at any given moment. Same with "can
> > operate on extremely large data sets"... it's not necessarily accurate to
> > say this is "unlike" others, because others may be able to do that, too, at
> > any given moment.
> >
> > More importantly, I think, is whether or not these are relevant. Our docs
> > can explain more details about history, background, prior technologies, and
> > comparisons with other things. Those things should not distract from
> > answering "What is *this* in front of me?" question. I especially think
>
> My goal for contextualizing was to explain "what is this in front of
> me" in relative terms. Sometimes it's easier to understand a thing
> relative to other things that you know.
>

Fair enough. However, I prefer to describe it in absolute terms, rather than relative to other stuff, because I think it's better not to assume that the reader has any such relatable knowledge.

> > this or the board report... the board isn't going to want to see an
> > explanation of what Percolator is, or an explanation of limitations in
> > other software... they just want a reminder of what this particular Apache
> > project is about.
> >
> >>
> >> > comparison stuff, and using a more "get to the point" description which lets
> >> > users know why they should care about Fluo, like:
> >> >
> >> > Apache Fluo is a distributed processing system which allows users to
> >> > continuously and incrementally join new data with extremely large existing
> >> > data sets without reprocessing all the data. It provides low-latency,
> >> > high-throughput data processing into Apache Accumulo, with cross-node
> >> > transaction support and a notification system to automatically process new
> >> > data with a user-defined workflow.
> >>
> >> Thinking about this some more based on your feedback, I think there are
> >> three different aspects that are important to communicate to someone
> >> trying to quickly decide if they should look into this further.
> >>
> >> 1. What capabilities it offers to users
> >> 2. How it works
> >> 3. Context, how does it compare to other big data technologies.
> >>
> >> Below is a rough outline of an attempt to communicate these three aspects.
> >>
> >> Intro:
> >>
> >> Apache Fluo is a distributed data processing system built on Apache
> >> Accumulo.
> >>
> >> Capabilities:
> >>
> >> Fluo allows users to continuously join new data into large existing
> >> data sets without reprocessing all data. With Fluo, users can keep
> >> multiple dependent derived data sets up to date as new data arrives.
> >> Changes to derived data sets can be emitted to external query or
> >> analytics systems.
> >>
> >> How it works:
> >>
> >> Fluo achieves this by offering the ability to execute user defined
> >> cross node transactions when data changes.
> >>
> >> Context:
> >>
> >> Fluo offers much lower latency than batch frameworks and can operate
> >> on larger data sets than streaming frameworks.
> >>
> >
> > I like the way you broke it down, but I think the result is too long for a
> > description. I don't think the "Context" portion should be included.
> > The "Capabilities" section reads like a follow-up paragraph, which begins going
> > into more details.
> > (also, I like my wording better. ;)
> >
> > I do think the way you've broken it down would be great for the front page
> > of Fluo's website, though, in this sectioned format. But, the description
> > should be just a "taste", so the smaller and more to the point it is, the
> > better.
> >
> > In any case, for the purposes of this report, please select whatever
> > description you think best. We can continue to discuss how best to present
> > Fluo to others through improved description, etc., separately from the
> > board report.
> >
> >> >> >
> >> >> > ## Issues:
> >> >> >
> >> >> > - There are no issues requiring board attention at this time.
> >> >> >
> >> >> > ## Activity:
> >> >> >
> >> >> > - Released versions of Fluo are tightly coupled with YARN+Twill for launching
> >> >> >   services on a cluster. Work is currently under way to break this tight
> >> >> >   coupling in order to support YARN+Twill, Mesos, and Kubernetes.
> >> >> > - Fluo implements an immutable byte array wrapper in its API. Work on moving
> >> >> >   this to its own sub-project is underway. The goal of this is to create
> >> >> >   something analogous to String for bytes that is suitable for use in other APIs.
> >> >> >   This goal was discussed on an OpenJDK list and there was agreement Java needs a
> >> >> >   project like this until Java defines a bigger story for immutability.
> >> >> >
> >> >
> >> > Could mention that we're still working through INFRA tasks for graduation
> >> > and are mostly transitioned.
> >> > (https://issues.apache.org/jira/servicedesk/agent/INFRA/issue/INFRA-14714)
> >> >
> >> >> > ## Health report:
> >> >> >
> >> >> > - Within the past three months
> >> >> >   - 46 issues were opened and 23 closed
> >> >> >   - 64 pull requests were opened and 56 were closed
> >> >> >   - 69 commits were made by 6 authors, 2 authors were not committers
> >> >> >
> >> >> > Contributions from non-committers. GH stats. 35 PRs in last 3
> >> >> > months.. 31 closed.
> >> >> >
> >> >> > ## PMC changes:
> >> >> >
> >> >> > - Currently 8 PMC members.
> >> >> > - Chris McTague was added to the PPMC on --- shortly before graduation.
> >> >> >
> >> >
> >> > Chris accepted membership on 2017-05-16, and had his ICLA filed and account
> >> > created on 2017-05-31.
> >> > Should mention explicitly that this happened after our last/final Incubator
> >> > report, and not just before graduation.
> >> >
> >> >> > ## Committer base changes:
> >> >> >
> >> >> > - Currently 8 committers.
> >> >> > - Chris McTague was added as a committer --- shortly before graduation.
> >> >> >
> >> >
> >> > Would be good to clarify that PMC == committer, at least for now. This is a
> >> > common question from the board.
> >> >
> >> >> > ## Releases:
> >> >> >
> >> >> > - fluo-1.1.0-incubating was released on Mon Jun 12 2017
> >> >> > - fluo-recipes-1.1.0-incubating was released on Thu Jun 22 2017
> >> >> >
> >> >> > On Mon, Aug 7, 2017 at 5:19 PM, Christopher <[email protected]> wrote:
> >> >> >> I think INFRA tasks are done.
> >> >> >>
> >> >> >> On Wed, Aug 2, 2017 at 10:52 AM Keith Turner <[email protected]> wrote:
> >> >> >>>
> >> >> >>> I started working on this and thinking about it a bit.
> >> >> >>>
> >> >> >>> I added our release dates at https://reporter.apache.org/
> >> >> >>>
> >> >> >>> Currently thinking of mentioning the following that has happened since
> >> >> >>> our last report to IPMC.
> >> >> >>> Thinking of linking to our last IPMC report.
> >> >> >>>
> >> >> >>> * Our new member Chris
> >> >> >>> * Our new releases
> >> >> >>> * We are waiting on INFRA, link to issues... maybe this will change
> >> >> >>>   before it's due next Wed
> >> >> >>> * Discussing the new work to support multiple ways to launch Fluo and
> >> >> >>>   deprecating our tight coupling to YARN in core.
> >> >> >>>
> >> >> >>> On Fri, Jul 28, 2017 at 7:59 AM, Brett Porter <[email protected]> wrote:
> >> >> >>> > This email was sent on behalf of the ASF Board. It is an initial reminder to
> >> >> >>> > give you plenty of time to prepare the report.
> >> >> >>> >
> >> >> >>> > According to board records, you are listed as the chair of a committee that is
> >> >> >>> > due to submit a report this month. [1] [2]
> >> >> >>> >
> >> >> >>> > The meeting is scheduled for Wed, 16 Aug 2017 at 10:30 PDT and the deadline for
> >> >> >>> > submitting your report is 1 full week prior to that (Wed Aug 9th)!
> >> >> >>> >
> >> >> >>> > Meeting times in other time zones:
> >> >> >>> >
> >> >> >>> > http://www.timeanddate.com/worldclock/fixedtime.html?iso=2017-08-16T10:30:00&msg=ASF+Board+Meeting&p1=137
> >> >> >>> >
> >> >> >>> > Please submit your report with sufficient time to allow the board members
> >> >> >>> > to review and digest. Again, the very latest you should submit your report
> >> >> >>> > is 1 full week (7 days) prior to the board meeting (Wed Aug 9th).
> >> >> >>> >
> >> >> >>> > If you feel that an error has been made, please consult [1] and if there
> >> >> >>> > is still an issue then contact the board directly.
> >> >> >>> >
> >> >> >>> > As always, PMC chairs are welcome to attend the board meeting.
> >> >> >>> >
> >> >> >>> > Thanks,
> >> >> >>> > The ASF Board
> >> >> >>> >
> >> >> >>> > [1] - https://svn.apache.org/repos/private/committers/board/committee-info.txt
> >> >> >>> > [2] - https://svn.apache.org/repos/private/committers/board/calendar.txt
> >> >> >>> > [3] - https://svn.apache.org/repos/private/committers/board/templates
> >> >> >>> > [4] - https://reporter.apache.org/
> >> >> >>> >
> >> >> >>> > Submitting your Report
> >> >> >>> > ----------------------
> >> >> >>> >
> >> >> >>> > Full details about the process and schedule are in [1].
> >> >> >>> >
> >> >> >>> > The report should be committed to the meeting agenda in the board directory
> >> >> >>> > in the foundation repository, trying to keep a similar format to the others.
> >> >> >>> > This can be found at:
> >> >> >>> >
> >> >> >>> > https://svn.apache.org/repos/private/foundation/board
> >> >> >>> >
> >> >> >>> > Reports can also be posted using the online agenda tool:
> >> >> >>> >
> >> >> >>> > https://whimsy.apache.org/board/agenda/2017-08-16/Fluo
> >> >> >>> >
> >> >> >>> > Your report should also be sent in plain-text format to [email protected]
> >> >> >>> > with a Subject line that follows the below format:
> >> >> >>> >
> >> >> >>> > Subject: [REPORT] Fluo - August 2017
> >> >> >>> >
> >> >> >>> > Cutting and pasting directly from a Wiki is not acceptable due to formatting
> >> >> >>> > issues. Line lengths should be limited to 77 characters.
> >> >> >>> >
> >> >> >>> > Chairs may use the Apache Reporter Service [4] to help them compile and
> >> >> >>> > submit a board report.
> >> >> >>> >
> >> >> >>> > Resolutions
> >> >> >>> > -----------
> >> >> >>> >
> >> >> >>> > There are several templates for use for various Board resolutions.
> >> >> >>> > They can be found in [3] and you are encouraged to use them. It is
> >> >> >>> > strongly recommended that if you have a resolution before the board,
> >> >> >>> > you are encouraged to attend that board meeting.
> >> >
