On Wed, Aug 9, 2017 at 1:34 PM, Christopher <ctubb...@apache.org> wrote:
> On Wed, Aug 9, 2017 at 10:41 AM Keith Turner <ke...@deenlo.com> wrote:
>
>> On Tue, Aug 8, 2017 at 7:36 PM, Christopher <ctubb...@apache.org> wrote:
>> > On Tue, Aug 8, 2017 at 5:47 PM Keith Turner <ke...@deenlo.com> wrote:
>> >>
>> >> For reference I used the following to collect stats
>> >>
>> >> https://github.com/issues?utf8=%E2%9C%93&q=is%3Aissue+repo%3Aapache%2Ffluo+repo%3Aapache%2Ffluo-recipes+repo%3Aapache%2Ffluo-website+closed%3A%3E%3D2017-05-01+
>> >>
>> >> https://github.com/issues?utf8=%E2%9C%93&q=is%3Aissue+repo%3Aapache%2Ffluo+repo%3Aapache%2Ffluo-recipes+repo%3Aapache%2Ffluo-website+created%3A%3E%3D2017-05-01+
>> >>
>> >> https://github.com/issues?utf8=%E2%9C%93&q=is%3Apr+repo%3Aapache%2Ffluo+repo%3Aapache%2Ffluo-recipes+repo%3Aapache%2Ffluo-website+closed%3A%3E%3D2017-05-01+
>> >>
>> >> https://github.com/issues?utf8=%E2%9C%93&q=is%3Apr+repo%3Aapache%2Ffluo+repo%3Aapache%2Ffluo-recipes+repo%3Aapache%2Ffluo-website+created%3A%3E%3D2017-05-01+
>> >>
>> >> git log --since 2017-05-01 | grep Author | sort -u
>> >>
>> >> On Tue, Aug 8, 2017 at 5:46 PM, Keith Turner <ke...@deenlo.com> wrote:
>> >> > Here is a draft of the report
>> >> >
>> >> > ## Description:
>> >> >
>> >> >  - Apache Fluo is an open source implementation of Percolator (which
>> >> >    populates Google's search index) for Apache Accumulo. With Fluo,
>> >> >    users can continuously join new data into large existing data sets
>> >> >    without reprocessing all data. Unlike batch and streaming
>> >> >    frameworks, Fluo offers much lower latency and can operate on
>> >> >    extremely large data sets.
>> >
>> >
>> > I'm not a big fan of this description. There are some elements of debatable
>> > accuracy and relevance. I suggest dropping the history/background and
>>
>> What do you think is inaccurate?
>>
>>

Good feedback and discussion; it definitely influenced what I put in the
report.  I never responded because I was in a time crunch trying to
get the report out the door.  I am going to open an issue to improve
the intro on the website based on this discussion.

> It's not that it's inaccurate... it's that it's temporal. Saying that it
> offers "much lower latency" compared to others is not likely to be a
> resilient statement as advances in those other frameworks progress. It may

I think it's reasonable to write things that one believes are presently
true.  If there is a concern, we can always explicitly state that it's
subject to change.

> be true... but it might not be true at any given moment. Same with "can
> operate on extremely large data sets"... it's not necessarily accurate to
> say this is "unlike" others, because others may be able to do that, too, at
> any given moment.
>
> More importantly, I think, is whether or not these are relevant. Our docs
> can explain more details about history, background, prior technologies, and
> comparisons with other things. Those things should not distract from
> answering "What is *this* in front of me?" question. I especially think

My goal for contextualizing was to explain "what is this in front of
me" in relative terms.  Sometimes it's easier to understand a thing
relative to other things that you know.

> this for the board report... the board isn't going to want to see an
> explanation of what Percolator is, or an explanation of limitations in
> other software... they just want a reminder of what this particular Apache
> project is about.
>
>
>>
>> > comparison stuff, and using a more "get to the point" description which
>> > lets users know why they should care about Fluo, like:
>> >
>> > Apache Fluo is a distributed processing system which allows users to
>> > continuously and incrementally join new data with extremely large
>> > existing data sets without reprocessing all the data. It provides
>> > low-latency, high-throughput data processing into Apache Accumulo, with
>> > cross-node transaction support and a notification system to
>> > automatically process new data with a user-defined workflow.
>>
>>
>> Thinking about this some more based on your feedback, I think there are
>> three different aspects that are important to communicate to someone
>> trying to quickly decide if they should look into this further.
>>
>>  1. What capabilities it offers to users
>>  2. How it works
>>  3. Context: how it compares to other big data technologies
>>
>> Below is a rough outline of an attempt to communicate these three aspects.
>>
>> Intro :
>>
>>   Apache Fluo is a distributed data processing system built on Apache
>> Accumulo.
>>
>> Capabilities :
>>
>>   Fluo allows users to continuously join new data into large existing
>> data sets without reprocessing all data.  With Fluo, users can keep
>> multiple dependent derived data sets up to date as new data arrives.
>> Changes to derived data sets can be emitted to external query or
>> analytics systems.
>>
>> How it works :
>>
>>   Fluo achieves this by offering the ability to execute user-defined
>> cross-node transactions when data changes.
>>
>> Context :
>>
>>   Fluo offers much lower latency than batch frameworks and can operate
>> on larger data sets than streaming frameworks.
>>
>>
> I like the way you broke it down, but I think the result is too long for a
> description. I don't think the "Context" portion should be included. The
> "Capabilities" section reads like a follow-up paragraph, which begins going
> into more details.
> (also, I like my wording better. ;)
>
> I do think the way you've broken it down would be great for the front page
> of Fluo's website, though, in this sectioned format. But, the description
> should be just a "taste", so the smaller and more to the point it is, the
> better.
>
> In any case, for the purposes of this report, please select whatever
> description you think best. We can continue to discuss how best to present
> Fluo to others through improved description, etc., separately from the
> board report.
>
>
>>
>> >
>> >
>> >>
>> >> >
>> >> > ## Issues:
>> >> >
>> >> >  - There are no issues requiring board attention at this time.
>> >> >
>> >> > ## Activity:
>> >> >
>> >> >  - Released versions of Fluo are tightly coupled with YARN+Twill for
>> >> >    launching services on a cluster.  Work is currently underway to
>> >> >    break this tight coupling in order to support YARN+Twill, Mesos,
>> >> >    and Kubernetes.
>> >> >  - Fluo implements an immutable byte array wrapper in its API.  Work
>> >> >    on moving this to its own sub-project is underway.  The goal of
>> >> >    this is to create something analogous to String for bytes that is
>> >> >    suitable for use in other APIs.  This goal was discussed on an
>> >> >    OpenJDK list and there was agreement Java needs a project like
>> >> >    this until Java defines a bigger story for immutability.
>> >> >
>> >
>> >
>> > Could mention that we're still working through INFRA tasks for graduation
>> > and are mostly transitioned.
>> > (https://issues.apache.org/jira/servicedesk/agent/INFRA/issue/INFRA-14714)
>> >
>> >
>> >>
>> >> > ## Health report:
>> >> >
>> >> >  - Within the past three months
>> >> >     - 46 issues were opened and 23 closed
>> >> >     - 64 pull requests were opened and 56 were closed
>> >> >     - 69 commits were made by 6 authors, 2 authors were not committers
>> >> >
>> >> > Contributions from non-committers.  GH stats.  35 PRs in last 3
>> >> > months.. 31 closed.
>> >> >
>> >> > ## PMC changes:
>> >> >
>> >> >  - Currently 8 PMC members.
>> >> >  - Chris McTague was added to the PPMC on --- shortly before
>> graduation.
>> >> >
>> >
>> >
>> > Chris accepted membership on 2017-05-16, and had his ICLA filed and
>> > account created on 2017-05-31.
>> > Should mention explicitly that this happened after our last/final
>> > Incubator report, and not just before graduation.
>> >
>> >>
>> >> > ## Committer base changes:
>> >> >
>> >> >  - Currently 8 committers.
>> >> >  - Chris McTague was added as a committer --- shortly before
>> graduation.
>> >> >
>> >
>> >
>> > Would be good to clarify that PMC == committer, at least for now. This
>> > is a common question from the board.
>> >
>> >>
>> >> > ## Releases:
>> >> >
>> >> >  - fluo-1.1.0-incubating was released on Mon Jun 12 2017
>> >> >  - fluo-recipes-1.1.0-incubating was released on Thu Jun 22 2017
>> >> >
>> >> > On Mon, Aug 7, 2017 at 5:19 PM, Christopher <ctubb...@apache.org>
>> wrote:
>> >> >> I think INFRA tasks are done.
>> >> >>
>> >> >> On Wed, Aug 2, 2017 at 10:52 AM Keith Turner <ke...@deenlo.com>
>> wrote:
>> >> >>>
>> >> >>> I started working on this and thinking about it a bit.
>> >> >>>
>> >> >>> I added our release dates at https://reporter.apache.org/
>> >> >>>
>> >> >>> Currently thinking of mentioning the following that has happened
>> since
>> >> >>> our last report to IPMC.  Thinking of linking to our last IPMC
>> report.
>> >> >>>
>> >> >>>  * Our new member Chris
>> >> >>>  * Our new releases
>> >> >>>  * We are waiting on INFRA, link to issues... maybe this will change
>> >> >>> before it's due next Wed
>> >> >>>  * Discussing the new work to support multiple ways to launch Fluo
>> and
>> >> >>> deprecating our tight coupling to YARN in core.
>> >> >>>
>> >> >>> On Fri, Jul 28, 2017 at 7:59 AM, Brett Porter <br...@apache.org>
>> >> >>> wrote:
>> >> >>> > This email was sent on behalf of the ASF Board.  It is an initial
>> >> >>> > reminder to
>> >> >>> > give you plenty of time to prepare the report.
>> >> >>> >
>> >> >>> > According to board records, you are listed as the chair of a
>> >> >>> > committee
>> >> >>> > that is
>> >> >>> > due to submit a report this month. [1] [2]
>> >> >>> >
>> >> >>> > The meeting is scheduled for Wed, 16 Aug 2017 at 10:30 PDT and the
>> >> >>> > deadline for
>> >> >>> > submitting your report is 1 full week prior to that (Wed Aug 9th)!
>> >> >>> >
>> >> >>> > Meeting times in other time zones:
>> >> >>> >
>> >> >>> >
>> >> >>> >
>> >> >>> >
>> http://www.timeanddate.com/worldclock/fixedtime.html?iso=2017-08-16T10:30:00&msg=ASF+Board+Meeting&p1=137
>> >> >>> >
>> >> >>> > Please submit your report with sufficient time to allow the board
>> >> >>> > members
>> >> >>> > to review and digest. Again, the very latest you should submit
>> your
>> >> >>> > report
>> >> >>> > is 1 full week (7 days) prior to the board meeting (Wed Aug 9th).
>> >> >>> >
>> >> >>> > If you feel that an error has been made, please consult [1] and if
>> >> >>> > there
>> >> >>> > is still an issue then contact the board directly.
>> >> >>> >
>> >> >>> > As always, PMC chairs are welcome to attend the board meeting.
>> >> >>> >
>> >> >>> > Thanks,
>> >> >>> > The ASF Board
>> >> >>> >
>> >> >>> > [1] -
>> >> >>> >
>> >> >>> >
>> https://svn.apache.org/repos/private/committers/board/committee-info.txt
>> >> >>> > [2] -
>> >> >>> >
>> https://svn.apache.org/repos/private/committers/board/calendar.txt
>> >> >>> > [3] -
>> >> >>> > https://svn.apache.org/repos/private/committers/board/templates
>> >> >>> > [4] - https://reporter.apache.org/
>> >> >>> >
>> >> >>> >
>> >> >>> > Submitting your Report
>> >> >>> > ----------------------
>> >> >>> >
>> >> >>> > Full details about the process and schedule are in [1].
>> >> >>> >
>> >> >>> > The report should be committed to the meeting agenda in the board
>> >> >>> > directory
>> >> >>> > in the foundation repository, trying to keep a similar format to
>> the
>> >> >>> > others.
>> >> >>> > This can be found at:
>> >> >>> >
>> >> >>> >   https://svn.apache.org/repos/private/foundation/board
>> >> >>> >
>> >> >>> > Reports can also be posted using the online agenda tool:
>> >> >>> >
>> >> >>> >   https://whimsy.apache.org/board/agenda/2017-08-16/Fluo
>> >> >>> >
>> >> >>> > Your report should also be sent in plain-text format to
>> >> >>> > bo...@apache.org
>> >> >>> > with a Subject line that follows the below format:
>> >> >>> >
>> >> >>> >   Subject: [REPORT] Fluo - August 2017
>> >> >>> >
>> >> >>> > Cutting and pasting directly from a Wiki is not acceptable due to
>> >> >>> > formatting
>> >> >>> > issues. Line lengths should be limited to 77 characters.
>> >> >>> >
>> >> >>> > Chairs may use the Apache Reporter Service [4] to help them
>> compile
>> >> >>> > and
>> >> >>> > submit a board report.
>> >> >>> >
>> >> >>> >
>> >> >>> > Resolutions
>> >> >>> > -----------
>> >> >>> >
>> >> >>> > There are several templates for use for various Board resolutions.
>> >> >>> > They can be found in [3] and you are encouraged to use them. If you
>> >> >>> > have a resolution before the board, it is strongly recommended
>> >> >>> > that you attend that board meeting.
>>
