Re: [Openvpn-devel] Better error recognition and error/status reporting to the mgmt i/f

Selva Fri, 22 Sep 2017 09:56:40 -0700

Hi David,

Thanks for taking time to look through this and for your well-considered
response.


Given that I may not remember the details when we get back to this topic,
let me add a few remarks.

On Thu, Sep 21, 2017 at 6:00 PM, David Sommerseth <openvpn@sf.lists.
topphemmelig.net> wrote:

> On 14/09/17 21:21, Selva wrote:
> > Hi,
>
> Hi Selva,
>
> Sorry for the long wait.  Things have been quite busy since that meeting.
>
> >
> > Quoting from the meeting logs:
> >
> >
> >     Discussed having more fine-grained signalling from OpenVPN to OpenVPN
> >     GUI. The lack of clear signals from OpenVPN to OpenVPN GUI has been a
> >     rather common problem:
> >
> >
> >
> >     <https://github.com/OpenVPN/openvpn-gui/issues/168#issuecom
> ment-305243166>
> >     <https://github.com/OpenVPN/openvpn-gui/issues/9>
> >     <https://github.com/OpenVPN/openvpn-gui/issues/183>
> >
> >
> >
> >     This problem is probably not limited to OpenVPN GUI (Windows), but
> also
> >     affects other GUI's like NetworkManager. It was agreed that the best
> >     approach is that Selva, mattock and others involved on the Windows
> side
> >     come up with a PoC or "RFC" of how this issue could be tackled ...
> >     OpenVPN core will then be modified if necessary.
> >
> >
> > No time to write an "RFC" or such for something of this sort, but here
> > are some comments/suggestions:
> >
> > 0. Do not send a status message that reads "CONNECTED, SUCCESS" to the
> > management when there has been critical errors such as: failed to add
> > routes or to set ipv6 address using the service or to talk to the
> > service, etc.. Send something like "CONNECTED,ERROR".
>
> I agree that "CONNECTED, SUCCESS" is wrong if routing or other network
> configuration related details fails.  On the other side,
> "CONNECTED,ERROR" have an interesting contradiction; almost sounds like
> an oxymoron.
>

FWIW, we do currently use "CONNECTED,ERROR" in some situations.
And, of late, the Windows GUI has learned not to bandy about the green
icon in such cases.

Not to say that this is good usage language-wise... Still better than
ERROR_SUCCESS which is the "error code" for success in WIndows API.


> I'd probably prefer something like "CONNECTED,PARTIAL_SUCCESS",
> "CONNECTED,SOME_FAILURES" or something similar.  But can easily tend
> towards a discussion of the colour of the bike shed.  As long as we have
> a clear understanding and documentation of these strings, we're good.
>

Even more fine grained info like CONNECTED,ROUTE_ERRORS etc could be
useful though not essential.


>
> > A status signalling with more fine-grained info to the management would
> > be helpful, but as a short-term fix, just changing SUCCESS to ERROR in
> > some cases may be a good start.
> >
> > 1. Mark all messages to the management as I (for info), W (for warning),
> > N (for non fatal error) etc. -- on some log lines this info is currently
> > missing. I think, part of the problem is not all messages are printed
> > with an M_xx flag.
>
> This sounds like a good idea.  But we need to sync-up with the Access
> Server team, to ensure we won't break anything there.
>

I looked further into this. It seems all messages that arrive without a
flag are
some kind of INFO messages. The problem is related to the fact that M_INFO
not a pure M_xx flag (it is loglevel 1 and M_x flag=0 combined) so it cant
be
x-ored with msg flags to find whether a message is INFO or not. Also the
Mx-flag bits for INFO being 0 makes it not testable by XOR.

Based on that I think all log messages that arrive in the management
without
a flag could be safely treated as some kind of INFO. Or, we could make
sure all messages with bits 4 to 23 of the flag not set (i.e. the M_xx flag
= 0)
are tagged as 'I'. Either would do.


>
> > 2. Mark non-fatal opevpn_execve errors as M_NONFATAL instead of M_WARN
>
> This needs a careful analysis to fully understand the consequences.
> There are quite some areas of the code where openvpn_execve_check() is
> used, which is the instance calling openvpn_execve().
>

I was going after the wrong suspect here. openvpn_execve is ok as is, as
its the caller which should decide what to do with errors. It is
 openvpn_execve_check that could use M_NONFATAL instead of
current M_WARN when instructed to continue after errors.

I do not see how changing the log message flag from WARN to
NONFATAL would affect anything else other than programs that parse
the logs.


> We have 70 direct callers of openvpn_execve_check(), but also
> openvpn_run_script() is used for the script hooks.  There might be a few
> more too.  And this covers also all supported platforms, which needs to
> be thoroughly tested.
>
> In addition, openvpn_execve_check() takes a flag argument, which
> controls if errors should be M_WARN or M_FATAL.


Yes, the flag it takes can only turn M_FATAL on or off so its up to it to
decide what to do when the flag is not set. It currently decides to treat
errros as warnings in such cases. Either we should give a finer control
to callers of  openvpn_execve_check or make it print M_NONFATAL.
Again this change looks benign.

But even that may not be enough in case of route errors is the trickier
issue.


> > 3. Treat failure to talk to the service (when msg_channnel is defined)
> > as M_NONFATAL not M_WARN
>
> Sounds reasonable.


> > 4. Mark route add/del errors via service as M_NONFATAL -- currently
> M_WARN.
> >  Mark route addition failure by other means as M_NONFATAL -- currently
> > M_WARN on all platforms, and all methods, I think.

This is somewhat related to 2), as the route manipulation happens via
> openvpn_execve_check().  I do think this needs to be handled in a
> platform agnostic approach and not have one set of error behaviour on
> Windows and another one of the other platforms.
>

> That said, at this years hackathon we will discuss a more thorough
> clean-up of the route.c and tun.c code.  So this issue here is something
> to bring into that discussion as well.
>
> > 5. Mark "ifconfig" (set address) errors as consistently M_FATAL or
> > M_NONFATAL (see comment on "fatality" below).
> > Currently these are M_WARN if done by service, M_FATAL otherwise.
> >
> > Given that M_FATAL should be used only very rarely, if at all --- e.g.,
> > if proceeding further makes no sense ---  most errors should be
> > M_NONFATAL. The above comments reflect that sentiment.
>
> M_FATAL is targeting scenarios where there is no point in continuing at
> all; where OpenVPN will not be able to recover on its own.  So I do
> understand that IP address configuration failing is initially tagged as
> M_FATAL.  But this is not as easy and clear when there is another tool
> OpenVPN needs to co-operate with which will manage parts of the tunnel.
> One of them have to be the final decision maker when to bail out and
> when to not do that.
>

The current approach of treating address configuration errors as FATAL looks
proper to me. If 'ifconfig' is delegated to scripts or to manual setting by
user, the
config file will have directives to that effect. Thus openvpn knows when it
is
responsible for setting the address and bailing out when it fails to do so
is
the right thing to do.


> > That said, whether address and route addition errors should be FATAL
> > deserves some discussion -- In case of address addition, an error
> > probably has to be FATAL, but "route add" failures are borderline cases.
> > E.g., if  "--redirect-gateway" fails, the tunnel may be  considered
> > meaningless in many use cases and thus a fatal error. So, some but not
> > all route-add errors may have to be treated as FATAL.
>
> This is the crux of it.  There is no way OpenVPN by itself knows
> beforehand when a route change is fatal or not.  This touches more some
> kind of policy handling, which needs to be configurable.  Right now it
> is more the kind of "best efforts" approach on the routing setup and
> "required" for the IP address configuration.
>

For many users the tunnel is dysfunctional unless proper routes are set up.
We need to treat route errors more seriously unless it is caused by "route
already exists". Agreed, whether route errors are to be treated as FATAL or
just an error does indeed need careful consideration. But a warning is
clearly
not enough.


>
> > If there is consensus, and an appetite for patch review, I can send in
> > some patches for 2 to 5 and possibly 1. For 0, I'm not sure how to keep
> > track of past errors to construct a useful status message.
>
> I think we all agree we need to improve this.  But how and at which
> scale is currently an open topic.  Right now, I think it is good to take
> some time to discuss and debate this issue.  Perhaps we should allocate
> one community developers meeting after the hackathon for discussing
> this.  I'm suggesting after the hackathon, to ensure we have some clear
> path forward on how we want to clean up route.c/tun.c.  This is a
> massive effort and I doubt it will be done too quickly, so once the have
> some path forward we should look into the error handling as well
> instantly afterwards.


Looking forward to more discussion on this.

Thanks again,

Selva

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot

_______________________________________________
Openvpn-devel mailing list
Openvpn-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/openvpn-devel

Re: [Openvpn-devel] Better error recognition and error/status reporting to the mgmt i/f

Reply via email to