Re: 1.7 release timeline

Josh Elser Tue, 07 Oct 2014 20:07:01 -0700

Some more information on the subject. A few of us got together to co-work today and had an informal discussion on our individual interests for 1.7. Summary incoming:


*Monitor re-write*

  - I was pushing this one. I think the monitor still has merit despite the desire of others to just integrate with external systems
  - I have some code in place, but it still needs more work
  - Is a unified/stable "metrics" API necessary for integration w/ external tools? (or is JMX enough?)
  - An API would probably be a more usable interface than JMX
  - Such an API should be stateless (no log aggregation nor statistics over time)
  - Monitor still has uses for standalone/small deployments
  - If still being used, MVC approach would ease testing and addition of new data and views
  - Not necessary to hold up 1.7.0 from happening


*Revisit performance*

  - Eric mentioned that he wants to spend some time running some Accumulo benchmarks, specifically YCSB.
  - Lots of related topics were mentioned that might be relevant
    * Other HDFS block cache implementations (HBase has lots of nice benchmarks, could learn from them)
    * A WIP patch for metadata updates has some promise (ACCUMULO-2889)
    * Collapse iterator stack (ACCUMULO-3079)
    * Possible improvements to Scanner for single-batch cases (reduce a few RPCs to one RPC)
  - Actual changes made likely to be found via investigation
  - Changing default conf values where relevant also mentioned

*Distributed Tracing*

  - Billie has been spending some time working w/ some people on replacing Cloudtrace with HTrace
  - Mentioned that HTrace shares a remarkable amount of similarity with our existing tracing library
  - Upstream efforts in Hadoop-3 to integrate htrace to DN/NN calls
  - Some consideration given to replacing traceserver with zipkin, though not required for the first implementation


*Decouple MiniAccumuloCluster from ITs*
  - Another one I've started working on
  - ITs are really great, we have a lot for really good cases
  - Running them against a real instance is infeasible right now
  - Would be good to express as many as possible in terms of only using Instance+Connector
  - Christopher mentioned possible benefit outside of tests to using the accumulo-maven-plugin as the "shim" between a real instance and a MiniAccumuloCluster
  - Some tests are written explicitly for MAC and must be ignored or run against a MAC when a real instance is available.


*Upgrade test script*

  - Keith mentioned there's some code from John McNamee that might help with testing upgrade paths


*Hadoop Metrics2*
  - Metrics2 is the current library in use by Hadoop
  - Integration gives us a lot more flexibility, notably good integration with Ganglia provided (ACCUMULO-1817)
  - No one expressed interest in working on this directly (potential to slip)


*Deprecate MockAccumulo?*
  - Talked about this for 1.6, decided against
  - It's now 1.7. Is it time?
  - Remember, deprecate != removal

There are some outstanding things we need to investigate more:

  - Is improved JMX or metrics2 impl sufficient for integration with external monitoring tools? (considerations: nagios, ganglia, statsd, collectd, carbon, riemann... others?)
  - BatchWriter has some weird cases around error handling. It is intended to survive failures, but that's very much not the case. Should probably be fixed around a major release, but we need to figure out how exactly to fix it (needs someone to get behind it)

If people want to continue discussion on these, let's break off individual topics into their own thread for clarity (and my sanity).


Also, anyone have a desire to be "release manager"?

- Josh

Josh Elser wrote:

Thanks, John.

I was thinking about trying to gun for January time-frame for a release.
I'd love to say before 2014 is over, but that probably just won't happen
for a major release with the holidays.

For 1.7 right now, I see the following "bigger" items (correct me where
I'm wrong):

* Replication (done)
* Upgrade rules/guarantees (proposed)
* Replace cloudtrace (in-progress)
* Rewrite monitor, include REST service (in-progress)
* Drop Hadoop 1 support (proposed)
* Decouple MiniAccumulo from ITs (in-progress)
* Other minicluster types: in-process, shim to real instance (in-progress)
* Support Hadoop metrics2 (proposed)
* A few WAL/metadata related performance improvements (in-progress)

Also, would be good to check the In-Progress state issues on JIRA. What
do people think?

John Vines wrote:

Moving this to its own thread...

On Mon, Oct 6, 2014 at 5:54 PM, Mike Drob wrote:

Related: Do we have a release timeline for 1.7?
