Re: Scan Server discussion [WAS: Re: 2.1 Release TODO]

Dave Marion Mon, 04 Apr 2022 10:11:07 -0700

I understand the desire to see less coupling for the optional features, but
getting to that point for ScanServers (and less so for ExternalCompactions)
would be a ton of work I think. The concern that I brought up in the "2.1
Release TODOs" thread regarding planning has not been addressed. If there
was a defined path forward, then that might make it easier to see how this
feature gets added in the near future in whatever form it takes.


Regarding the concern about the readiness of the feature branch, Keith is
doing a last pass review on the draft and then I believe we are ready to
take it out of draft state. I think it will be before the end of this week.
We have added six new integration tests and we have done some local and
cluster testing.

Regarding the concern mentioned above, "availability of time to review/test
such a big feature without delaying 2.1," I didn't realize that we had a
schedule.  Does it matter if it takes 2/4/6/8 weeks to test the 1000+
completed issues in this release? I know that we want to finish up the
2.1.0 release, but is there a target date?

On Mon, Apr 4, 2022 at 12:32 PM Christopher <ctubb...@apache.org> wrote:

> On Mon, Apr 4, 2022 at 11:50 AM Keith Turner <ke...@deenlo.com> wrote:
> >
> > On Mon, Apr 4, 2022 at 11:17 AM Christopher <ctubb...@apache.org> wrote:
> > >
> > > However, I'm reluctant to include #2422, because I don't think it's
> > > near ready enough, and by the time it is, it will be very last
> > > minute, and I don't want to delay 2.1 further for it. Even if it's
> > > included as an experimental feature, I think it has huge potential to
> > > be disruptive, or to have a lot of churn by the time people actually
> > > have a chance to review it thoroughly. Furthermore, I think there are
> > > possible alternatives (like a fully client-side implementation, based
> > > on offline scanners) that would avoid the tight coupling of a new
> > > service to Accumulo's core code. This
> >
> > There are some advantages to scan servers over direct file access to
> > consider.  One is scalability of computation, if a web server is
> > serving N client queries with scan servers those can potentially go to
> > different scan servers.  With direct file access, all N queries and
> > their iterator stacks would have to run in the web server.  Another is
> > scalability of caching/memory.  When web servers send queries to scan
> > servers using a sticky algorithm for assigning tablets to groups of
> > scan servers, it could lead to good cache utilization and sharing that
> > may not be possible when running scans directly in the web server. So
> > scan servers allow scaling cache and computations for queries
> > independently of web servers in a way that may not be possible with
> > direct file access.
> >
> > Another advantage to consider is isolation.  With direct file access
> > and queries running directly in a web server, a bad query could bring
> > down a web server and lots of unrelated queries.  Having a bad query
> > bring down a scan server may be less disruptive.
> >
>
> I've forked this thread into its own discussion with a new subject
> line, because, as I suggested in my original reply, my intent was not
> to hijack the 2.1 planning thread with a discussion of the ScanServer
> implementation details.
>
> I'm fine with all those benefits (even if all the "could" and "may"
> were turned into concrete "will"). My objection is not an objection to
> the feature. It's an objection to including the feature in 2.1, based
> on:
>
> * readiness of the feature branch,
> * availability of time to review/test such a big feature without
> delaying 2.1,
> * its tight coupling to the core code in the implementation, and
> * the as-yet-unexplored possibility that less tightly coupled
> solutions with the above benefits may exist.
>
> I would be more okay with including it if:
>
> * it is ready,
> * it has been tested and reviewed by the wider community,
> * its coupling to the core Accumulo code is loosened, ideally if it's
> designed to use only API/SPI, and could be released as a separate,
> optional add-on. This might require improvements to API/SPI to expose
> the features needed to help it function. This could also be done by
> sub-classing the AccumuloClient. My concern here is the risk of
> technical debt and the extra maintenance costs of increased complexity
> for optional features that go unmaintained.
>
> We've been hurt before by premature inclusion of optional/experimental
> features that were rushed to release. No matter how awesome the
> feature is... if it's niche and optional, we should consider these
> risks and work to mitigate them. Otherwise, we'll be stuck with the
> technical debt for years to come. With a little bit of caution, we can
> make the feature available, without rushing, to satisfy the use case
> while reducing the risks.
>
> Also, one point of clarification: when I say "fully client side", I
> only mean relative to Accumulo, not necessarily in the client process.
> I'm lacking vocabulary to describe what I mean. As I understand it,
> the current client code has been modified to connect to ScanServers
> sitting off to the side of TabletServers, and the ScanServers are
> basically modified TabletServers with less functionality. What I mean
> is that instead of coupling the ScanServer to the TabletServer
> implementation, and coupling the ScanServer client to the
> AccumuloClient, there could be less coupling. The ScanServer itself
> could behave like a client to Accumulo and/or HDFS (and maybe even
> share some library code that we make public API, like RFile readers)
> and it could have its own client (this is just one very rough outline
> of an idea that could be explored). That way, the entire thing could
> be removed without any change in Accumulo's code, to make it truly
> optional (as in, optional to even have on the class path).
>
