On Wed, Aug 9, 2017 at 1:34 PM, Christopher <[email protected]> wrote:
> On Wed, Aug 9, 2017 at 10:41 AM Keith Turner <[email protected]> wrote:
>
>> On Tue, Aug 8, 2017 at 7:36 PM, Christopher <[email protected]> wrote:
>> > On Tue, Aug 8, 2017 at 5:47 PM Keith Turner <[email protected]> wrote:
>> >>
>> >> For reference I used the following to collect stats
>> >>
>> >> https://github.com/issues?utf8=%E2%9C%93&q=is%3Aissue+repo%3Aapache%2Ffluo+repo%3Aapache%2Ffluo-recipes+repo%3Aapache%2Ffluo-website+closed%3A%3E%3D2017-05-01+
>> >>
>> >> https://github.com/issues?utf8=%E2%9C%93&q=is%3Aissue+repo%3Aapache%2Ffluo+repo%3Aapache%2Ffluo-recipes+repo%3Aapache%2Ffluo-website+created%3A%3E%3D2017-05-01+
>> >>
>> >> https://github.com/issues?utf8=%E2%9C%93&q=is%3Apr+repo%3Aapache%2Ffluo+repo%3Aapache%2Ffluo-recipes+repo%3Aapache%2Ffluo-website+closed%3A%3E%3D2017-05-01+
>> >>
>> >> https://github.com/issues?utf8=%E2%9C%93&q=is%3Apr+repo%3Aapache%2Ffluo+repo%3Aapache%2Ffluo-recipes+repo%3Aapache%2Ffluo-website+created%3A%3E%3D2017-05-01+
>> >>
>> >> git log --since 2017-05-01 | grep Author | sort -u
>> >>
>> >> On Tue, Aug 8, 2017 at 5:46 PM, Keith Turner <[email protected]> wrote:
>> >> > Here is a draft of the report
>> >> >
>> >> > ## Description:
>> >> >
>> >> > - Apache Fluo is an open source implementation of Percolator (which
>> >> >   populates Google's search index) for Apache Accumulo. With Fluo,
>> >> >   users can continuously join new data into large existing data sets
>> >> >   without reprocessing all data. Unlike batch and streaming
>> >> >   frameworks, Fluo offers much lower latency and can operate on
>> >> >   extremely large data sets.
>> >
>> >
>> > I'm not a big fan of this description. There's some elements of debatable
>> > accuracy and relevancy. I suggest dropping the history/background and
>>
>> What do you think is inaccurate?
>>

Good feedback and discussion; it definitely influenced what I put in the
report. I never responded because I was in a time crunch trying to get
the report out the door. I am going to open an issue to improve the
intro on the website based on this discussion.

> It's not that it's inaccurate... it's that it's temporal. Saying that it
> offers "much lower latency" compared to others is not likely to be a
> resilient statement as advances in those other frameworks progress. It may

I think it's reasonable to write things that one believes are presently
true. If there is a concern, one can always explicitly state that it's
subject to change.

> be true... but it could not be true at any given moment. Same with "can
> operate on extremely large data sets"... it's not necessarily accurate to
> say this is "unlike" others, because others may be able to do that, too, at
> any given moment.
>
> More importantly, I think, is whether or not these are relevant. Our docs
> can explain more details about history, background, prior technologies, and
> comparisons with other things. Those things should not distract from
> answering "What is *this* in front of me?" question. I especially think

My goal for contextualizing was to explain "what is this in front of me"
in relative terms. Sometimes it's easier to understand a thing relative
to other things that you know.

> this or the board report... the board isn't going to want to see an
> explanation of what Percolator is, or an explanation of limitations in
> other software... they just want a reminder of what this particular Apache
> project is about.
>
>
>>
>> > comparison stuff, and using a more "get to the point" description which
>> > lets users know why they should care about Fluo, like:
>> >
>> > Apache Fluo is a distributed processing system which allows users to
>> > continuously and incrementally join new data with extremely large
>> > existing data sets without reprocessing all the data.
>> > It provides low-latency, high-throughput data processing into Apache
>> > Accumulo, with cross-node transaction support and a notification system
>> > to automatically process new data with a user-defined workflow.
>>
>>
>> Thinking about this some more based on your feedback, I think there are
>> three different aspects that are important to communicate to someone
>> trying to quickly decide if they should look into this further.
>>
>> 1. What capabilities it offers to users
>> 2. How it works
>> 3. Context, how does it compare to other big data technologies.
>>
>> Below is a rough outline of an attempt to communicate these three aspects.
>>
>> Intro :
>>
>> Apache Fluo is a distributed data processing system built on Apache
>> Accumulo.
>>
>> Capabilities :
>>
>> Fluo allows users to continuously join new data into large existing
>> data sets without reprocessing all data. With Fluo, users can keep
>> multiple dependent derived data sets up to date as new data arrives.
>> Changes to derived data sets can be emitted to external query or
>> analytics systems.
>>
>> How it works :
>>
>> Fluo achieves this by offering the ability to execute user defined
>> cross node transactions when data changes.
>>
>> Context :
>>
>> Fluo offers much lower latency than batch frameworks and can operate
>> on larger data sets than streaming frameworks.
>>
>>
> I like the way you broke it down, but I think the result is too long for a
> description. I don't think the "Context" portion should be included. The
> "Capabilities" section reads like a follow-up paragraph, which begins going
> into more details.
> (also, I like my wording better. ;)
>
> I do think the way you've broken it down would be great for the front page
> of Fluo's website, though, in this sectioned format. But, the description
> should be just a "taste", so the smaller and more to the point it is, the
> better.
>
> In any case, for the purposes of this report, please select whatever
> description you think best. We can continue to discuss how best to present
> Fluo to others through improved description, etc., separately from the
> board report.
>
>
>>
>> >
>> >>
>> >> >
>> >> > ## Issues:
>> >> >
>> >> > - There are no issues requiring board attention at this time.
>> >> >
>> >> > ## Activity:
>> >> >
>> >> > - Released versions of Fluo are tightly coupled with YARN+Twill for
>> >> >   launching services on a cluster. Work is currently under way to
>> >> >   break this tight coupling in order to support YARN+Twill, Mesos,
>> >> >   and Kubernetes.
>> >> > - Fluo implements an immutable byte array wrapper in its API. Work on
>> >> >   moving this to its own sub-project is underway. The goal of this is
>> >> >   to create something analogous to String for bytes that is suitable
>> >> >   for use in other APIs. This goal was discussed on an OpenJDK list
>> >> >   and there was agreement Java needs a project like this until Java
>> >> >   defines a bigger story for immutability.
>> >> >
>> >
>> >
>> > Could mention that we're still working through INFRA tasks for graduation
>> > and are mostly transitioned.
>> > (https://issues.apache.org/jira/servicedesk/agent/INFRA/issue/INFRA-14714)
>> >
>> >
>> >>
>> >> > ## Health report:
>> >> >
>> >> > - Within the past three months
>> >> >   - 46 issues were opened and 23 closed
>> >> >   - 64 pull requests were opened and 56 were closed
>> >> >   - 69 commits were made by 6 authors, 2 authors were not committers
>> >> >
>> >> > Contributions from non-committers. GH stats. 35 PRs in last 3
>> >> > months.. 31 closed.
>> >> >
>> >> > ## PMC changes:
>> >> >
>> >> > - Currently 8 PMC members.
>> >> > - Chris McTague was added to the PPMC on --- shortly before graduation.
>> >> >
>> >
>> >
>> > Chris accepted membership on 2017-05-16, and had his ICLA filed and
>> > account created on 2017-05-31.
>> > Should mention explicitly that this happened after our last/final
>> > Incubator report, and not just before graduation.
>> >
>> >>
>> >> > ## Committer base changes:
>> >> >
>> >> > - Currently 8 committers.
>> >> > - Chris McTague was added as a committer --- shortly before graduation.
>> >> >
>> >
>> >
>> > Would be good to clarify that PMC == committer, at least for now. This is
>> > a common question from the board.
>> >
>> >>
>> >> > ## Releases:
>> >> >
>> >> > - fluo-1.1.0-incubating was released on Mon Jun 12 2017
>> >> > - fluo-recipes-1.1.0-incubating was released on Thu Jun 22 2017
>> >> >
>> >> > On Mon, Aug 7, 2017 at 5:19 PM, Christopher <[email protected]> wrote:
>> >> >> I think INFRA tasks are done.
>> >> >>
>> >> >> On Wed, Aug 2, 2017 at 10:52 AM Keith Turner <[email protected]> wrote:
>> >> >>>
>> >> >>> I started working on this and thinking about it a bit.
>> >> >>>
>> >> >>> I added our release dates at https://reporter.apache.org/
>> >> >>>
>> >> >>> Currently thinking of mentioning the following that has happened
>> >> >>> since our last report to IPMC. Thinking of linking to our last
>> >> >>> IPMC report.
>> >> >>>
>> >> >>> * Our new member Chris
>> >> >>> * Our new releases
>> >> >>> * We are waiting on INFRA, link to issues... maybe this will change
>> >> >>>   before it's due next Wed
>> >> >>> * Discussing the new work to support multiple ways to launch Fluo
>> >> >>>   and deprecating our tight coupling to YARN in core.
>> >> >>>
>> >> >>> On Fri, Jul 28, 2017 at 7:59 AM, Brett Porter <[email protected]> wrote:
>> >> >>> > This email was sent on behalf of the ASF Board. It is an initial
>> >> >>> > reminder to give you plenty of time to prepare the report.
>> >> >>> >
>> >> >>> > According to board records, you are listed as the chair of a
>> >> >>> > committee that is due to submit a report this month.
>> >> >>> > [1] [2]
>> >> >>> >
>> >> >>> > The meeting is scheduled for Wed, 16 Aug 2017 at 10:30 PDT and the
>> >> >>> > deadline for submitting your report is 1 full week prior to that
>> >> >>> > (Wed Aug 9th)!
>> >> >>> >
>> >> >>> > Meeting times in other time zones:
>> >> >>> >
>> >> >>> > http://www.timeanddate.com/worldclock/fixedtime.html?iso=2017-08-16T10:30:00&msg=ASF+Board+Meeting&p1=137
>> >> >>> >
>> >> >>> > Please submit your report with sufficient time to allow the board
>> >> >>> > members to review and digest. Again, the very latest you should
>> >> >>> > submit your report is 1 full week (7 days) prior to the board
>> >> >>> > meeting (Wed Aug 9th).
>> >> >>> >
>> >> >>> > If you feel that an error has been made, please consult [1] and if
>> >> >>> > there is still an issue then contact the board directly.
>> >> >>> >
>> >> >>> > As always, PMC chairs are welcome to attend the board meeting.
>> >> >>> >
>> >> >>> > Thanks,
>> >> >>> > The ASF Board
>> >> >>> >
>> >> >>> > [1] - https://svn.apache.org/repos/private/committers/board/committee-info.txt
>> >> >>> > [2] - https://svn.apache.org/repos/private/committers/board/calendar.txt
>> >> >>> > [3] - https://svn.apache.org/repos/private/committers/board/templates
>> >> >>> > [4] - https://reporter.apache.org/
>> >> >>> >
>> >> >>> >
>> >> >>> > Submitting your Report
>> >> >>> > ----------------------
>> >> >>> >
>> >> >>> > Full details about the process and schedule are in [1].
>> >> >>> >
>> >> >>> > The report should be committed to the meeting agenda in the board
>> >> >>> > directory in the foundation repository, trying to keep a similar
>> >> >>> > format to the others.
>> >> >>> > This can be found at:
>> >> >>> >
>> >> >>> > https://svn.apache.org/repos/private/foundation/board
>> >> >>> >
>> >> >>> > Reports can also be posted using the online agenda tool:
>> >> >>> >
>> >> >>> > https://whimsy.apache.org/board/agenda/2017-08-16/Fluo
>> >> >>> >
>> >> >>> > Your report should also be sent in plain-text format to
>> >> >>> > [email protected] with a Subject line that follows the below
>> >> >>> > format:
>> >> >>> >
>> >> >>> > Subject: [REPORT] Fluo - August 2017
>> >> >>> >
>> >> >>> > Cutting and pasting directly from a Wiki is not acceptable due to
>> >> >>> > formatting issues. Line lengths should be limited to 77 characters.
>> >> >>> >
>> >> >>> > Chairs may use the Apache Reporter Service [4] to help them
>> >> >>> > compile and submit a board report.
>> >> >>> >
>> >> >>> >
>> >> >>> > Resolutions
>> >> >>> > -----------
>> >> >>> >
>> >> >>> > There are several templates for use for various Board resolutions.
>> >> >>> > They can be found in [3] and you are encouraged to use them. It is
>> >> >>> > strongly recommended that if you have a resolution before the
>> >> >>> > board, you are encouraged to attend that board meeting.
