Re: Proposal for Heron API Server

Karthik Ramasamy Tue, 25 Jul 2017 11:55:25 -0700

Bill - 

The main driving factors for the API server are the following:

- How can Heron jobs be managed purely through an API, instead of the CLI, 
without any additional dependency?

- A single place for maintaining config including keys (which you don’t want to 
expose to every client)

- Reduce the installation steps needed

- Provide authentication support

The rest of my responses are inlined below, marked k>

> On Jul 25, 2017, at 10:15 AM, Bill Graham <[email protected]> wrote:
> 
> It's not entirely accurate that Heron's deployment mode is only library
> mode. The Heron scheduler could be implemented to either manage resource
> scheduling from the client (i.e., Aurora Scheduler) or to run the
> scheduling logic on the scheduler framework (i.e., Yarn). ISchedulerClient
> has both LibrarySchedulerClient and HttpServiceSchedulerClient for these
> two use cases. These are the modes for scheduling single topologies
> components though, and not about managing a centralized scheduling service
> for multi-tenant usage which this proposal is about. Basically, we already
> have Library and Service modes as terminology in the existing codebase, so
> we shouldn't overload the concepts with a new definition of service mode.

k>I agree I might be overloading the terminology here. Whether the scheduler 
is run in library mode or in the scheduler framework is independent of the API 
server. The API server is not a scheduling service - it is just a REST endpoint 
server that translates REST API calls into actions. Perhaps a change of 
terminology might make it easier to understand. Any suggestions?
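
To make "translates REST API calls into actions" concrete, here is a minimal sketch of such a translation layer. It is illustrative only: the URL layout, the `route` and `handle` names, and the handler table are all hypothetical and are not the actual API server design. The five actions are the ones proposed for the first version.

```python
# Hypothetical sketch: a stateless mapping from REST requests to topology
# actions. The route layout and function names are assumptions for
# illustration, not Heron's actual API.

ACTIONS = {"submit", "kill", "update", "activate", "deactivate"}


def route(method, path):
    """Translate (method, path) into (action, topology) or raise ValueError.

    Assumed path layout: /api/v1/topologies/<name>/<action>
    """
    parts = [p for p in path.split("/") if p]
    if (method != "POST" or len(parts) != 5
            or parts[:3] != ["api", "v1", "topologies"]):
        raise ValueError("unsupported request: %s %s" % (method, path))
    topology, action = parts[3], parts[4]
    if action not in ACTIONS:
        raise ValueError("unknown action: %s" % action)
    return action, topology


def handle(method, path, handlers):
    """Dispatch one request to a handler; nothing is kept between calls.

    `handlers` maps an action name to a callable taking the topology name.
    Each callable would wrap the existing submit/kill/... code path, so how
    the scheduler itself runs is untouched by this layer.
    """
    action, topology = route(method, path)
    return handlers[action](topology)
```

Because `handle` holds no state between requests, any instance can serve any request, which is what allows the server to be restarted or replicated freely.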

> If config distribution is the main issue, have we explored adding support
> for fetching configs from a repository, just as we upload and fetch the
> binary?

k>We did explore keeping the config in a central place. However, there are 
issues with this approach:

- The Heron CLI has to download the config every time it has to 
submit/kill/activate/deactivate a topology. Alternatively, the config can be 
cached, but then it requires periodic invalidation and refresh on the client 
side, which could lead to issues.

- All the keys and other sensitive values could be exposed on the client (if 
you are working with cloud environments)

- If we have to manage the jobs programmatically, including 
submission/killing/updating/activating/deactivating, it introduces a 
dependency, such as downloading the config before submitting, which makes it 
cumbersome for programmers.
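
The caching issue in the first point can be sketched as follows. This is a hypothetical time-to-live cache, not anything in the Heron CLI; `CachedConfig`, the `fetch` callback, and the config keys are made up for illustration. It shows that between refreshes the client can act on a config that has already changed centrally.

```python
import time


class CachedConfig:
    """Hypothetical client-side config cache with a time-to-live."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch          # callable that downloads the config
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so the demo is deterministic
        self._value = None
        self._fetched_at = None

    def get(self):
        now = self._clock()
        expired = (self._fetched_at is None
                   or now - self._fetched_at >= self._ttl)
        if expired:
            self._value = self._fetch()
            self._fetched_at = now
        return self._value


# Demo with a fake clock and an in-memory stand-in for the central config.
central = {"scheduler": "aurora"}
now = [0.0]
cache = CachedConfig(lambda: dict(central), ttl_seconds=60,
                     clock=lambda: now[0])

first = cache.get()               # downloads the config
central["scheduler"] = "yarn"     # config changes centrally
now[0] = 30.0
stale = cache.get()               # still the old value until the TTL expires
now[0] = 60.0
fresh = cache.get()               # refreshed only after the TTL has elapsed
```

A shorter TTL narrows the stale window but brings back the download-on-every-command cost, which is the trade-off described above.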

> 
> One concern about adding a scheduling service, is that it creates yet
> another service to be maintained, and it increases the matrix of modes of
> deployment available which adds complexity. For example today Aurora
> topologies can be submitted in local mode only, but they can be updated in
> local or service mode. YARN does both submit and update in service mode
> today. With this additional service, we would need to support those modes,
> plus those modes when run behind yet another service. The combination of
> modes gets complex because we now have anywhere from 0..2 potential layers of
> services to go through.

k>As pointed out above, this is not a scheduling service - it is just a REST 
endpoint. The API service will be deployed as yet another job, similar to 
heron-ui and heron-tracker. The service will be stateless, so the scheduler 
can simply restart it if it dies, which makes it fault tolerant. We can also 
run multiple instances of the service for scalability.

Furthermore, the API server will preserve those deployment modes for Aurora 
and YARN, regardless of whether you deploy through the API server or directly 
from the dev machine (as we do now).


> This approach also requires the design of a delegated auth mechanism. For
> example if the deploy service is running as a shared account, how will it
> delegate auth on behalf of the user who is deploying the topology? If we go
> down this path, we'd need to design for this.

k>As I mentioned earlier, one of the motivations for the API server is to 
implement some kind of authentication (Kerberos/TLS/LDAP). However, the first 
phase will provide the functionality, and the second phase will add an 
authentication mechanism.

> I also share Maosong's concern of merging the tracker into the api service.
> The design of the system will be more clear and easy to maintain/manage if
> each system could live independently. If the goal is to make it easier for
> administrators to manage all at once, I'd suggest we handle that with admin
> management scripts that could simplify common tasks without merging the
> service code.

k>In fact, I would argue the other way around, since the main focus of the API 
server is to provide a REST API:

- Why not move all the APIs into a single service rather than having two?

- Furthermore, the current tracker uses the state manager for getting metadata 
etc. Since the tracker is written in Python, the state manager functionality 
has to be duplicated in Python and Java.

The plan is to write the API server in Java, so we can eliminate all the 
Python code for the state manager, thereby reducing duplicate functionality 
across languages. Our initial focus is to get this service rolled out with the 
first phase of the API (submit/kill/update/activate/deactivate); in the second 
phase we can merge the tracker.

Note that introducing the API server does not change the current mode of 
deployment in any way.

cheers
/karthik

> On Mon, Jul 24, 2017 at 6:27 PM, Karthik Ramasamy <[email protected]>
> wrote:
> 
>> 1st version of the api server will support the following commands
>> 
>> - submit
>> - kill
>> - update
>> - activate
>> - deactivate
>> 
>> We are designing API server to be stateless and it will run as a job in the
>> scheduler (similar to tracker and UI). With this approach, there is no need
>> to worry about availability issues.
>> 
>> cheers
>> /karthik
>> 
>> On Mon, Jul 24, 2017 at 5:43 PM, Fu Maosong <[email protected]> wrote:
>> 
>>> I like the idea of *service mode* for heron.
>>> 
>>> But we need to be more cautious about merging tracker into API Server,
>>> since it can easily bring scalability and availability issues.
>>> BTW, storm's nimbus serves both topology management requests as well as
>>> metrics requests, which is kind of "merging tracker into API server". We
>>> can learn the pros&cons of such design from it.
>>> 
>>> 
>>> 2017-07-24 16:57 GMT-07:00 Karthik Ramasamy <[email protected]>:
>>> 
>>>> *Rationale*:
>>>> 
>>>> Currently, Heron supports a single mode of deployment called library
>>>> mode. Library mode requires several steps and client-side configuration,
>>>> which can be laborious. Hence, we want to support another mode called
>>>> service mode for simplified deployment.
>>>> 
>>>> *Library Mode:*
>>>> 
>>>> With Heron, the current mode of deployment is called library mode. This
>>>> mode does not require any services to be running for Heron to deploy,
>>>> which is a huge advantage. However, it requires several configurations
>>>> on the client side. Because of this, administration becomes harder -
>>>> especially maintaining the configuration and distributing it when it
>>>> changes. While this is possible for bigger teams with a dedicated
>>>> dev-ops team, it might be overhead for medium and smaller teams.
>>>> Furthermore, this mode of deployment does not have an API to
>>>> submit/kill/activate/deactivate programmatically.
>>>> 
>>>> *Service Mode:*
>>>> 
>>>> In this mode, an API server will be running as a service. This service
>>>> will be run as yet another job in the scheduler so that it is restarted
>>>> after machine and process failures, thereby providing fault tolerance.
>>>> The API server will maintain the configuration, and the heron cli will
>>>> be augmented to use the REST API to submit/kill/activate/deactivate
>>>> topologies in this mode. The advantage of this mode is that it
>>>> simplifies deployment, but it requires running a service.
>>>> 
>>>> *Merging Tracker into API Server:*
>>>> 
>>>> Currently, the Heron tracker, written in Python, duplicates the state
>>>> manager code in Python as well. The API server will support the Heron
>>>> tracker API in addition to the topologies API. Depending on the mode of
>>>> deployment, the API server can be deployed in one of two modes - library
>>>> mode (which exposes only the tracker API) and service mode (which
>>>> exposes both the tracker and the API server). Initially, the tracker and
>>>> API server will be in separate directories until a great amount of
>>>> testing is done. Once that is completed, we can think about cutting over
>>>> entirely to the API server.
>>>> 
>>>> This change will not affect any of the existing deployments and it will
>>>> be backward compatible.
>>>> 
>>> 
>>> 
>>> 
>>> --
>>> With my best Regards
>>> ------------------
>>> Fu Maosong
>>> Twitter Inc.
>>> Mobile: +001-415-244-7520
>>> 
>>
