Re: [DISCUSS] KNOX-1006 Topology generation behavior

Philip Zampino Tue, 26 Sep 2017 10:21:41 -0700

When you say “descriptor check API”, do you mean validating only that we can 
generate correctly-formed URL(s) for the declared services? Or, are you 
proposing further validation of the service components’ configurations?


-- 
Phil
 

On 9/26/17, 12:38 PM, "larry mccay" <[email protected]> wrote:

    Yes, the difference will be a service connectivity error vs a 404 not found
    if it isn't in the topology at all.
    
    As for HIVE and cliservice - I have never seen anything other than
    cliservice be used there. It seems like it is possible to configure it to
    something else though.
    
    There is definitely a diagnosability issue with the whole deployment
    mechanism.
    
    If we fail the topology generation completely then we will get 404's for
    any service that is attempted after deployment.
    If we only omit those services that fail to resolve then we only get 404's
    on the ones that failed to resolve.
    
    Generally, when you get a 404 on a service call you are going to check the
    topology to make sure it is actually in there and configured correctly.
    For service discovery based deployments, we would check the simple
    descriptor.
    
    It would be great if we could have some realtime check of the
    discoverability of the services selected, so that a bad configuration is
    detected up front - at least for Ambari-type scenarios where a list of
    services can be selected and a simple descriptor submitted.
    WARN the user that there are unresolvable services selected and they can
    continue or cancel.
    Continue would result in those services staying in the simple descriptor
    but not in the topology and discovery service monitoring would have to
    resolve later.
    Cancel would allow them to either unselect the missing services or deploy
    the services so that they can be resolved and start over.
    
    Manual editing would just assume "Continue" mode.
    
    What do you think about a descriptor check API?
    
    
    On Tue, Sep 26, 2017 at 11:45 AM, Philip Zampino <[email protected]>
    wrote:
    
    > Considering it from a behavior point of view, I mostly agree that there is
    > no difference between a service that is missing and a service that is
    > incorrect. Do you get a service connectivity error if a service isn’t
    > declared in a topology? I suspect it’s probably a 401 from Knox in that
    > case, which is different than a service with an incorrect URL.
    >
    > In some cases, we can default to what we know should work, especially when
    > there is not a valid alternative. For instance, Hive support requires the
    > http transport mode, so we can always discover the HTTP URL whether the
    > component is correctly configured for http transport or not; then, as
    > you’ve said, the component config can be corrected, and the Knox proxy
    > will just work. Even for the Hive service though, some properties are
    > likely to be incorrect until the component configuration is modified (e.g.,
    > hive.server2.thrift.http.path has no value by default). So, even the
    > default in this case won’t just start working if the component
    > configuration is corrected; the topology will have to be regenerated.
    >
    > In the instructions for configuring Hive for the HTTP transport mode (
    > http://knox.apache.org/books/knox-0-13-0/user-guide.html#Hive), the
    > specified http path is “cliservice”, but could that not be any arbitrary
    > value? By default this property has no value, so we would generate
    > http://HIVESERVER2_HOST:HTTP_PORT/ ; when the aforementioned instructions
    > are followed, the actual URL will be
    > http://HIVESERVER2_HOST:HTTP_PORT/cliservice,
    > and the topology will still be incorrect. If we default the path to
    > “cliservice”, and the user specifies “donttellmewhattodo”, the result is
    > the same. This is all to say that defaulting the Hive service URL will still
    > be troublesome, but there are certainly services for which a reasonable
    > default is plausible.
    >
    > In other cases, the URL could be entirely invalid (e.g., missing config
    > properties), but a configuration change noticed by a configuration monitor
    > (i.e., KNOX-1013) could resolve that eventually.
    > For these cases, I think we’re in agreement that they can be omitted from
    > the generated topology since the source descriptor will still have the
    > declarations.
    >
    > Thanks for the insight. I think we’re close to a good compromise.
    >
    > --
    > Phil
    >
    >
    > On 9/26/17, 11:04 AM, "larry mccay" <[email protected]> wrote:
    >
    >     Hi Phil -
    >
    >     Thanks for bringing this up for discussion.
    >
    >     I do agree with the descriptor author's intent but at the same time,
    >     they also intend for the others to be available.
    >     There isn't much difference between a topology with service elements
    >     that can't be reached and one without the service elements in it.
    >
    >     More than likely, when you deploy a topology and can't access a
    >     service - like HIVE - you will go to Ambari to check the status on the
    >     service. In this case you will notice that it isn't deployed or
    >     configured correctly - like in http mode. You take the actions in
    >     Ambari and the service is now accessible. Having to go and add it to
    >     the topology after that shouldn't be necessary.
    >
    >     I think that we could consider how the monitoring of the discovery
    >     service is going to be driven.
    >     If it is driven by the simple descriptor - which makes sense - then I
    >     think that it could result in a topology with only those services that
    >     can be discovered. As long as the others are still in the descriptor
    >     they can be discovered later and the topology automagically gets
    >     updated with the additions.
    >
    >     This gives us a situation where only "correct" topologies are deployed
    >     and they will be autocorrecting as others come online.
    >     Even the HIVE situation would fix itself just by putting it in the
    >     right mode.
    >
    >     My suggestion would be to skip those that can't be fully discovered and
    >     log each one as WARNING.
    >     Monitoring of the discovery service based on descriptors rather than
    >     topologies would be able to correct as appropriate.
    >
    >     What do you think?
    >
    >     thanks,
    >
    >     --larry
    >
    >
    >     On Tue, Sep 26, 2017 at 10:42 AM, Philip Zampino <[email protected]>
    >     wrote:
    >
    >     > I’ve been thinking about the behavior wrt topology generation when
    >     > the URL(s) for a service declared in a simple descriptor cannot be
    >     > correctly/completely determined.
    >     >
    >     > The options available include:
    >     >
    >     >
    >     >   1.  Abort the topology generation because we can’t produce what
    >     >       has been requested.
    >     >
    >     >   2.  Complete the topology generation without those services whose
    >     >       URL(s) could not be determined.
    >     >       The unresolved services could be omitted or commented-out in
    >     >       the resulting topology file.
    >     >
    >     >   3.  Complete the topology generation, allowing the descriptor
    >     >       deployer the opportunity to “fill in the blanks”.
    >     >       This will result in Knox deploying a topology it knows to be
    >     >       incorrect.
    >     >       API deployments may not afford the deployer the opportunity to
    >     >       “fill in the blanks” (e.g., Ambari-driven deployments).
    >     >
    >     >
    >     > My initial feeling on this is that we should not produce anything
    >     > less than what the descriptor declares (i.e., #1). After all, the
    >     > declared services are in the descriptor precisely because someone
    >     > wants to access them through Knox.
    >     >
    >     > I could possibly be persuaded that producing a partial topology
    >     > (i.e., #2) may be acceptable, but it’s still not what the descriptor
    >     > author intends/requires.
    >     >
    >     > I don’t believe Knox should ever produce or deploy a topology it
    >     > knows to be incorrect (i.e., #3).
    >     >
    >     > One example, which came up during the review of KNOX-1014, is HIVE;
    >     > if the hiveserver2 component is not configured for HTTP transport,
    >     > then there is no valid URL for that service, as far as Knox is
    >     > concerned. In this case, I think we must abort the topology
    >     > generation or omit the HIVE service from the generated topology.
    >     >
    >     > Interested in your thoughts…
    >     >
    >     > --
    >     > Phil
    >     >
    >     >
    >
    >
    >
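
For illustration only, the "omit and WARN" behavior (option 2) and the descriptor check idea discussed in this thread could be sketched roughly as below. This is not Knox code: the function names, the resolver callback, and the sample host are all hypothetical, and real discovery (e.g., via Ambari) is replaced here by a plain dictionary lookup.

```python
import logging
from typing import Callable, Dict, List, Optional, Tuple

log = logging.getLogger("descriptor_sketch")

# Hypothetical resolver type: returns a service URL, or None if discovery fails.
Resolver = Callable[[str], Optional[str]]


def check_descriptor(declared: List[str], resolve_url: Resolver) -> List[str]:
    """Return the declared services whose URL cannot currently be resolved.

    A caller could use this list to warn the user before deployment and let
    them continue or cancel.
    """
    return [name for name in declared if not resolve_url(name)]


def generate_topology(
    declared: List[str], resolve_url: Resolver
) -> Tuple[Dict[str, str], List[str]]:
    """Build a partial topology, omitting services that cannot be resolved.

    Unresolved services are logged at WARNING level and returned separately;
    they remain in the descriptor so a later discovery pass can add them.
    """
    resolved: Dict[str, str] = {}
    unresolved: List[str] = []
    for name in declared:
        url = resolve_url(name)
        if url:
            resolved[name] = url
        else:
            log.warning("Omitting service %s: URL could not be discovered", name)
            unresolved.append(name)
    return resolved, unresolved


if __name__ == "__main__":
    logging.basicConfig(level=logging.WARNING)
    # Assumed sample data: only WEBHDFS is discoverable in this made-up cluster.
    discovered = {"WEBHDFS": "http://namenode.example.com:50070/webhdfs"}
    topology, missing = generate_topology(["WEBHDFS", "HIVE"], discovered.get)
    print(topology, missing)
```

In this sketch the descriptor stays the source of truth: re-running `generate_topology` after the cluster configuration changes would pick up services that were previously unresolved.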
