Re: [RFC 0/3] netlink: extended error reporting

2017-04-07 Thread Johannes Berg
On Fri, 2017-04-07 at 12:55 -0700, David Miller wrote:
> 
> One idea.  We use the macro thing to generate a "netlink.pot" file
> and then some userland tree can contain the latest netlink.pot and
> the translations.

Right. For the record - since we just talked about it - I was thinking
of putting it into a special section so that we could easily write a
script to generate the pot file from the kernel binary after doing
"allyesconfig" or collecting it from modules or so.

Thinking of modules though - do we need to adjust the module loading
code to load a new section? I'll have to look into that.

johannes


Re: [RFC 0/3] netlink: extended error reporting

2017-04-07 Thread David Miller
From: Johannes Berg 
Date: Fri, 07 Apr 2017 21:46:46 +0200

> On Fri, 2017-04-07 at 12:43 -0700, David Miller wrote:
>> I suspect that someone will eventually give us a hard time about the
>> strings wrt. internationalization. :-) It's solvable at least
>> partially in userspace I suppose.
> 
> I tend to think of the strings more of a debugging aid, but that's a
> good point.
> 
> Perhaps we should have a macro that has the strings - similar to the
> inline function I put into netlink - but if we make that a macro we can
> put the strings into a separate section to be able to find them more
> easily for later translation, and optionally even omit them entirely
> (to satisfy the kernel tinification folks)?

One idea.  We use the macro thing to generate a "netlink.pot" file and
then some userland tree can contain the latest netlink.pot and the
translations.

I'm suggesting it this way because whilst putting the netlink.pot file
in the kernel tree is probably OK, putting all the translation there
might not be.


Re: [RFC 0/3] netlink: extended error reporting

2017-04-07 Thread Johannes Berg
On Fri, 2017-04-07 at 21:45 +0200, Pablo Neira Ayuso wrote:
> 
> We only need to store the pointer to the attribute in the error
> container structure. We can calculate the offset from nl_err() by
> pasing the skbuff as parameter there, right?

Ah, that's a good point, we could store the nlattr pointer and
calculate the offset later. I'll need to try this, but yeah, good idea!

johannes


Re: [RFC 0/3] netlink: extended error reporting

2017-04-07 Thread Johannes Berg
On Fri, 2017-04-07 at 12:43 -0700, David Miller wrote:
> 
> I have no strong opinions about string length.
> 
> In my opinion, I would like to believe that if someone tried to get a
> networking patch applied that emitted a rediculously long string then
> we would give them feedback about how that is not acceptable.  Right?

Agree with this, I don't really see much point in adding an extra
warning. There's an implicit (multi-KB though) limit as well in the
message size ;-)

> I suspect that someone will eventually give us a hard time about the
> strings wrt. internationalization. :-) It's solvable at least
> partially in userspace I suppose.

I tend to think of the strings more of a debugging aid, but that's a
good point.

Perhaps we should have a macro that has the strings - similar to the
inline function I put into netlink - but if we make that a macro we can
put the strings into a separate section to be able to find them more
easily for later translation, and optionally even omit them entirely
(to satisfy the kernel tinification folks)?

johannes


Re: [RFC 0/3] netlink: extended error reporting

2017-04-07 Thread Pablo Neira Ayuso
On Fri, Apr 07, 2017 at 09:29:17PM +0200, Johannes Berg wrote:
> On Fri, 2017-04-07 at 21:21 +0200, Pablo Neira Ayuso wrote:
> > I think the most flexible way is to pass the container error
> > structure to nla_parse() so it sets this for you. This would also
> > save tons of "malformed attribute" strings.
> 
> Yes, for sure. The thing is we'll probalby have to pass down the
> request skb *and* the error struct so that we can get the offset, and
> this seems like the generic thing that we really should try to get the
> most information generated.

We only need to store the pointer to the attribute in the error
container structure. We can calculate the offset from nl_err() by
pasing the skbuff as parameter there, right?


Re: [RFC 0/3] netlink: extended error reporting

2017-04-07 Thread David Miller
From: Pablo Neira Ayuso 
Date: Fri, 7 Apr 2017 21:35:50 +0200

> On Fri, Apr 07, 2017 at 12:20:53PM -0700, David Miller wrote:
> [...]
>> Let's just discuss the UAPI, since people complain we don't talk
>> about that enough :-)  For those playing at home it is three new
>> attributes returned in a netlink ACK when the application asks
>> for the extended response:
>> 
>>  NLMSGERR_ATTR_MSG   string  Extended error string
>>  NLMSGERR_ATTR_OFFS  u32 Byte offset to netlink element causing 
>> error
>>  NLMSGERR_ATTR_CODE  u32 Subsystem specific error code
>>  NLMSGERR_ATTR_ATTR  u16 Netlink attribute triggering error or 
>> missing
> 
> I think it would be good if we get a definition to cap the maximum
> string length to something reasonable? This can be added in a follow
> up patch BTW. Thus, we get people coming back to us and request larger
> strings with a reason why they need more room for this.
> 
> In general, my main concern with strings is that they can be used in a
> more freely way than these u32 offsets and error codes, and
> specifically how inconsistent these string will look like across
> different netlink subsystems.
> 
> Anyway, as long as this is optional (not every subsystem if forced to
> use strings) I'm fine with it :).

I have no strong opinions about string length.

In my opinion, I would like to believe that if someone tried to get a
networking patch applied that emitted a rediculously long string then
we would give them feedback about how that is not acceptable.  Right?

I suspect that someone will eventually give us a hard time about the
strings wrt. internationalization. :-) It's solvable at least
partially in userspace I suppose.



Re: [RFC 0/3] netlink: extended error reporting

2017-04-07 Thread Pablo Neira Ayuso
On Fri, Apr 07, 2017 at 12:20:53PM -0700, David Miller wrote:
[...]
> Let's just discuss the UAPI, since people complain we don't talk
> about that enough :-)  For those playing at home it is three new
> attributes returned in a netlink ACK when the application asks
> for the extended response:
> 
>   NLMSGERR_ATTR_MSG   string  Extended error string
>   NLMSGERR_ATTR_OFFS  u32 Byte offset to netlink element causing 
> error
>   NLMSGERR_ATTR_CODE  u32 Subsystem specific error code
>   NLMSGERR_ATTR_ATTR  u16 Netlink attribute triggering error or 
> missing

I think it would be good if we get a definition to cap the maximum
string length to something reasonable? This can be added in a follow
up patch BTW. Thus, we get people coming back to us and request larger
strings with a reason why they need more room for this.

In general, my main concern with strings is that they can be used in a
more freely way than these u32 offsets and error codes, and
specifically how inconsistent these string will look like across
different netlink subsystems.

Anyway, as long as this is optional (not every subsystem if forced to
use strings) I'm fine with it :).


Re: [RFC 0/3] netlink: extended error reporting

2017-04-07 Thread David Miller
From: Pablo Neira Ayuso 
Date: Fri, 7 Apr 2017 21:27:14 +0200

> On Fri, Apr 07, 2017 at 12:22:23PM -0700, David Miller wrote:
>> From: Johannes Berg 
>> Date: Fri, 07 Apr 2017 21:09:45 +0200
>> 
>> > On Fri, 2017-04-07 at 21:06 +0200, Pablo Neira Ayuso wrote:
>> >> On Fri, Apr 07, 2017 at 08:59:12PM +0200, Johannes Berg wrote:
>> >> [...]
>> >> > Heh. I think I really want to solve - at least partially -
>> >> > nla_parse()
>> >> > to see that it can be done this way. It'd be nice to even transform
>> >> > all
>> >> > the callers (I generated half of these patches with spatch anyway)
>> >> > to
>> >> > have at least that.
>> >> 
>> >> We can just have a modified version of nla_parse that deals with
>> >> this.
>> > 
>> > Yes, but we need to figure out a good way to have the offset.
>> > 
>> > We also need to see if we want to *force* having the offset. In some
>> > sense that'd be useful, in another it might be very complicated to fill
>> > it in at all times, if for example errors come from lower layers like
>> > drivers.
>> 
>> It has to be optional, some kinds of errors don't have an exact
>> context per-se.
>> 
>> Also another way to look at this is that we're providing a lot of
>> new power and expressability.  So even if only one aspect of the
>> new error reporting is used it's a positive step forward.
>> 
>> So allow offset "0" meaning "unspecified".
> 
> Instead, we can just not send the offset attribute to userspace if
> it's not specified. So missing attribute means "unspecified".

Agreed, not providing the attribute to indicate this is fine.


Re: [RFC 0/3] netlink: extended error reporting

2017-04-07 Thread David Miller
From: Pablo Neira Ayuso 
Date: Fri, 7 Apr 2017 21:21:34 +0200

> For my usecases in netfilter, the attributes and an specific error
> code should be enough to figure out what is wrong. Will not need
> strings.
> 
> BTW, we may not have an offset, eg. EINVAL because of missing
> attribute. Given we have different requirements, I would leave it to
> each subsystem to decide what netlink error attributes are specified.

Yep, completely agreed.

The use cases for offset and missing attribute I see as follows:

1) Top-level attribute is missing. Here, offset is set to zero
   and the missing attribute number is given as well.

2) Nested attribute is missing.  Offset is set to the location of the
   beginning of nesting, inside of which the missing attribute was
   needed.


Re: [RFC 0/3] netlink: extended error reporting

2017-04-07 Thread Johannes Berg
On Fri, 2017-04-07 at 21:27 +0200, Pablo Neira Ayuso wrote:
> 
> > Also another way to look at this is that we're providing a lot of
> > new power and expressability.  So even if only one aspect of the
> > new error reporting is used it's a positive step forward.

True.

> > So allow offset "0" meaning "unspecified".
> 
> Instead, we can just not send the offset attribute to userspace if
> it's not specified. So missing attribute means "unspecified".
> 
> I'm always a bit worried this "0 means something" semantics :)

Yeah, I have that. If it's 0 internally the attribute will be omitted.

johannes


Re: [RFC 0/3] netlink: extended error reporting

2017-04-07 Thread Johannes Berg
On Fri, 2017-04-07 at 21:21 +0200, Pablo Neira Ayuso wrote:
> 
> For my usecases in netfilter, the attributes and an specific error
> code should be enough to figure out what is wrong. Will not need
> strings.

Fair enough.

> BTW, we may not have an offset, eg. EINVAL because of missing
> attribute. Given we have different requirements, I would leave it to
> each subsystem to decide what netlink error attributes are specified.
> 
> > (It's ultimately always going to be optional since we'll continue
> > returning errors without *any* extended error information - likely
> > indefinitely - but if we have a wrong attribute, should we always
> > have
> > an offset? Would be nice, but could be difficult in practice)
> > 
> > > We can probably use struct nla_policy to place the extended error
> > > code or the string (as we discussed I don't need string errors,
> > > but
> > > I'm fine if other people find it useful).
> > 
> > I don't think for the error strings really would be useful for
> > nla_parse() or a policy - we can return something generic like
> > "malformed attribute" there as a string, and hopefully point to the
> > attribute/offset from there generically. I just really want to see
> > how
> > to actually do that.
> 
> I think the most flexible way is to pass the container error
> structure to nla_parse() so it sets this for you. This would also
> save tons of "malformed attribute" strings.

Yes, for sure. The thing is we'll probalby have to pass down the
request skb *and* the error struct so that we can get the offset, and
this seems like the generic thing that we really should try to get the
most information generated.

johannes


Re: [RFC 0/3] netlink: extended error reporting

2017-04-07 Thread Pablo Neira Ayuso
On Fri, Apr 07, 2017 at 12:22:23PM -0700, David Miller wrote:
> From: Johannes Berg 
> Date: Fri, 07 Apr 2017 21:09:45 +0200
> 
> > On Fri, 2017-04-07 at 21:06 +0200, Pablo Neira Ayuso wrote:
> >> On Fri, Apr 07, 2017 at 08:59:12PM +0200, Johannes Berg wrote:
> >> [...]
> >> > Heh. I think I really want to solve - at least partially -
> >> > nla_parse()
> >> > to see that it can be done this way. It'd be nice to even transform
> >> > all
> >> > the callers (I generated half of these patches with spatch anyway)
> >> > to
> >> > have at least that.
> >> 
> >> We can just have a modified version of nla_parse that deals with
> >> this.
> > 
> > Yes, but we need to figure out a good way to have the offset.
> > 
> > We also need to see if we want to *force* having the offset. In some
> > sense that'd be useful, in another it might be very complicated to fill
> > it in at all times, if for example errors come from lower layers like
> > drivers.
> 
> It has to be optional, some kinds of errors don't have an exact
> context per-se.
> 
> Also another way to look at this is that we're providing a lot of
> new power and expressability.  So even if only one aspect of the
> new error reporting is used it's a positive step forward.
> 
> So allow offset "0" meaning "unspecified".

Instead, we can just not send the offset attribute to userspace if
it's not specified. So missing attribute means "unspecified".

I'm always a bit worried this "0 means something" semantics :)


Re: [RFC 0/3] netlink: extended error reporting

2017-04-07 Thread Johannes Berg
On Fri, 2017-04-07 at 12:20 -0700, David Miller wrote:

> But what it lacks fundamentally is context.  Therefore it can't be
> used to provide the offset or the bad attribute number.  So it can't
> meet our requirements.

Yes, it doesn't address the requirements here, and in a sense I suspect
this will be true anywhere else too - perhaps that's the reason why it
was never applied even over in perf. I've not gotten a good answer for
that question yet :)

> We've passed down new arguments into core method call chains for much
> less useful facilitites than this.  So that doesn't bother me.  And
> as that is a kernel internal matter, we can refine it later.

Yes, both are true.

> Wrt. nla_parse(), we can solve that problem as follow-on patches too.

Yes. I just want to be sure we can solve it, but OTOH we're not locking
down anything but the UAPI so whatever way we find to solve it, even if
it requires a complete rearchitecture internally, it still doesn't
matter.

> So I consider this change ready as far as the implementation is
> concerned.

Fair enough. I still do need to test it though :)

> Ok, Jiri, start reading.  I will try to make you happy here.
> 
> Let's just discuss the UAPI, since people complain we don't talk
> about that enough :-)  For those playing at home it is three new
> attributes returned in a netlink ACK when the application asks
> for the extended response:
> 
> NLMSGERR_ATTR_MSG   string  Extended error string
> NLMSGERR_ATTR_OFFS  u32 Byte offset to netlink
> element causing error
> NLMSGERR_ATTR_CODE  u32 Subsystem specific error code
> NLMSGERR_ATTR_ATTR  u16 Netlink attribute triggering
> error or missing

That's four ;-)

Since you're carefully listing it here, you should also say that

 * the new behaviour is entirely optional, enabled by a new netlink
   socket option
 * the ACK message format ends up being
   [nlmsg header]
   [ack header including errno]
   [request message nlmsg hdr]
   [request message payload - optional, controlled by socket option,
aligned to 4 byte length if present]
   [new TLVs - optional, controlled by socket option]

johannes


Re: [RFC 0/3] netlink: extended error reporting

2017-04-07 Thread David Miller
From: Johannes Berg 
Date: Fri, 07 Apr 2017 21:09:45 +0200

> On Fri, 2017-04-07 at 21:06 +0200, Pablo Neira Ayuso wrote:
>> On Fri, Apr 07, 2017 at 08:59:12PM +0200, Johannes Berg wrote:
>> [...]
>> > Heh. I think I really want to solve - at least partially -
>> > nla_parse()
>> > to see that it can be done this way. It'd be nice to even transform
>> > all
>> > the callers (I generated half of these patches with spatch anyway)
>> > to
>> > have at least that.
>> 
>> We can just have a modified version of nla_parse that deals with
>> this.
> 
> Yes, but we need to figure out a good way to have the offset.
> 
> We also need to see if we want to *force* having the offset. In some
> sense that'd be useful, in another it might be very complicated to fill
> it in at all times, if for example errors come from lower layers like
> drivers.

It has to be optional, some kinds of errors don't have an exact
context per-se.

Also another way to look at this is that we're providing a lot of
new power and expressability.  So even if only one aspect of the
new error reporting is used it's a positive step forward.

So allow offset "0" meaning "unspecified".


Re: [RFC 0/3] netlink: extended error reporting

2017-04-07 Thread David Miller
From: Johannes Berg 
Date: Fri, 07 Apr 2017 20:59:12 +0200

> Alexander Shishkin's patch
> https://patchwork.kernel.org/patch/7162421/
> 
> was nice in a way because you could get away without passing the error
> structure down all the time, but like I said it doesn't deal with
> dynamic errors (even the offset/attr from nla_parse) and if it's not
> done *everywhere* it risks leaking those exterr numbers (-1024..-4095)
> to userspace through syscalls if it's used in a non-netlink path for
> some reason (e.g. a single driver call being called through both
> netlink and ioctl, and trying to return an extended error)

Alexander's patch is cute in that it annotates errors using the special
section blobs.

But what it lacks fundamentally is context.  Therefore it can't be
used to provide the offset or the bad attribute number.  So it can't
meet our requirements.

We've passed down new arguments into core method call chains for much
less useful facilitites than this.  So that doesn't bother me.  And
as that is a kernel internal matter, we can refine it later.

Wrt. nla_parse(), we can solve that problem as follow-on patches too.

So I consider this change ready as far as the implementation is
concerned.

Ok, Jiri, start reading.  I will try to make you happy here.

Let's just discuss the UAPI, since people complain we don't talk
about that enough :-)  For those playing at home it is three new
attributes returned in a netlink ACK when the application asks
for the extended response:

NLMSGERR_ATTR_MSG   string  Extended error string
NLMSGERR_ATTR_OFFS  u32 Byte offset to netlink element causing 
error
NLMSGERR_ATTR_CODE  u32 Subsystem specific error code
NLMSGERR_ATTR_ATTR  u16 Netlink attribute triggering error or 
missing

So please consider this interface carefully.


Re: [RFC 0/3] netlink: extended error reporting

2017-04-07 Thread Pablo Neira Ayuso
On Fri, Apr 07, 2017 at 09:09:45PM +0200, Johannes Berg wrote:
> On Fri, 2017-04-07 at 21:06 +0200, Pablo Neira Ayuso wrote:
> > On Fri, Apr 07, 2017 at 08:59:12PM +0200, Johannes Berg wrote:
> > [...]
> > > Heh. I think I really want to solve - at least partially -
> > > nla_parse()
> > > to see that it can be done this way. It'd be nice to even transform
> > > all
> > > the callers (I generated half of these patches with spatch anyway)
> > > to
> > > have at least that.
> > 
> > We can just have a modified version of nla_parse that deals with
> > this.
> 
> Yes, but we need to figure out a good way to have the offset.
> 
> We also need to see if we want to *force* having the offset. In some
> sense that'd be useful, in another it might be very complicated to fill
> it in at all times, if for example errors come from lower layers like
> drivers.

For my usecases in netfilter, the attributes and an specific error
code should be enough to figure out what is wrong. Will not need
strings.

BTW, we may not have an offset, eg. EINVAL because of missing
attribute. Given we have different requirements, I would leave it to
each subsystem to decide what netlink error attributes are specified.

> (It's ultimately always going to be optional since we'll continue
> returning errors without *any* extended error information - likely
> indefinitely - but if we have a wrong attribute, should we always have
> an offset? Would be nice, but could be difficult in practice)
> 
> > We can probably use struct nla_policy to place the extended error
> > code or the string (as we discussed I don't need string errors, but
> > I'm fine if other people find it useful).
> 
> I don't think for the error strings really would be useful for
> nla_parse() or a policy - we can return something generic like
> "malformed attribute" there as a string, and hopefully point to the
> attribute/offset from there generically. I just really want to see how
> to actually do that.

I think the most flexible way is to pass the container error structure
to nla_parse() so it sets this for you. This would also save tons of
"malformed attribute" strings.


Re: [RFC 0/3] netlink: extended error reporting

2017-04-07 Thread Johannes Berg
On Fri, 2017-04-07 at 21:06 +0200, Pablo Neira Ayuso wrote:
> On Fri, Apr 07, 2017 at 08:59:12PM +0200, Johannes Berg wrote:
> [...]
> > Heh. I think I really want to solve - at least partially -
> > nla_parse()
> > to see that it can be done this way. It'd be nice to even transform
> > all
> > the callers (I generated half of these patches with spatch anyway)
> > to
> > have at least that.
> 
> We can just have a modified version of nla_parse that deals with
> this.

Yes, but we need to figure out a good way to have the offset.

We also need to see if we want to *force* having the offset. In some
sense that'd be useful, in another it might be very complicated to fill
it in at all times, if for example errors come from lower layers like
drivers.

(It's ultimately always going to be optional since we'll continue
returning errors without *any* extended error information - likely
indefinitely - but if we have a wrong attribute, should we always have
an offset? Would be nice, but could be difficult in practice)

> We can probably use struct nla_policy to place the extended error
> code or the string (as we discussed I don't need string errors, but
> I'm fine if other people find it useful).

I don't think for the error strings really would be useful for
nla_parse() or a policy - we can return something generic like
"malformed attribute" there as a string, and hopefully point to the
attribute/offset from there generically. I just really want to see how
to actually do that.

johannes


Re: [RFC 0/3] netlink: extended error reporting

2017-04-07 Thread Johannes Berg
On Fri, 2017-04-07 at 21:02 +0200, Pablo Neira Ayuso wrote:
> 
> > I'm tempted to apply this as-is just to show that person that
> > things do in fact happen eventually :-)
> 
> We can just send follow up patches to refine, I think it's a good
> start, Johannes?

I guess we can. Like I just said though - I do have a plan to make
nla_parse() work but the week's been exhausting enough that I don't
feel entirely comfortable saying that will definitely work.

OTOH, arguably this is something that seems to work - as evidenced by
the nl80211 changes - for at least some error reporting, so I guess we
can continue to refine it, and the only thing we're really locking in
to is the ABI, which is TLVs now, so that should be safe.

That said, the nl80211 patch really should be bigger, and I've not even
tested this at all yet, so I really need to do that first.

Also, I'd like to point to the genl_info change, does that seem like a
reasonable way to pass it around in genetlink? I'm much less familiar
with the "regular" netlink to say how to pass it around there beyond
what I already did.

johannes


Re: [RFC 0/3] netlink: extended error reporting

2017-04-07 Thread Pablo Neira Ayuso
On Fri, Apr 07, 2017 at 08:59:12PM +0200, Johannes Berg wrote:
[...]
> Heh. I think I really want to solve - at least partially - nla_parse()
> to see that it can be done this way. It'd be nice to even transform all
> the callers (I generated half of these patches with spatch anyway) to
> have at least that.

We can just have a modified version of nla_parse that deals with this.
We can probably use struct nla_policy to place the extended error code
or the string (as we discussed I don't need string errors, but I'm
fine if other people find it useful).

Thanks!


Re: [RFC 0/3] netlink: extended error reporting

2017-04-07 Thread Pablo Neira Ayuso
On Fri, Apr 07, 2017 at 11:53:15AM -0700, David Miller wrote:
> From: Johannes Berg 
> Date: Fri,  7 Apr 2017 20:26:17 +0200
> 
> > So this is my first draft of what we'd talked about at netconf.
> > I'm not super happy with the way we have to pass the extended
> > error struct, but I don't see a way to implement reporting any
> > dynamic information (like error offsets) in any other way.
> > 
> > Alexander Shishkin had a nice way of reporting static extended
> > error data, but that isn't really suitable for reporting the
> > offset or even reporting the broken attribute from nla_parse().
> > 
> > Speaking of nla_parse(), that'll be somewhat complicated to do
> > since we'll have to track the offsets of where we're parsing,
> > but it might be possible since the nlattrs are just pointers
> > into the message, so (optionally?) passing the skb as well can
> > allow us to fill the offset information.
> 
> I like it, nice work.
> 
> I know people want dynamically generated strings and stuff, and we can
> get there, but I prefer that the first thing we commit is super simple.
> 
> Someone gave me a hard time about the fact that we've been talking
> about this idea for years but nothing ever happens.
> 
> I'm tempted to apply this as-is just to show that person that things
> do in fact happen eventually :-)

We can just send follow up patches to refine, I think it's a good
start, Johannes?

BTW, for this co-authored effort in designing this:

Signed-off-by: Pablo Neira Ayuso 

Thanks!


Re: [RFC 0/3] netlink: extended error reporting

2017-04-07 Thread Johannes Berg
On Fri, 2017-04-07 at 11:53 -0700, David Miller wrote:

> > Alexander Shishkin had a nice way of reporting static extended
> > error data, but that isn't really suitable for reporting the
> > offset or even reporting the broken attribute from nla_parse().
> > 
> > Speaking of nla_parse(), that'll be somewhat complicated to do
> > since we'll have to track the offsets of where we're parsing,
> > but it might be possible since the nlattrs are just pointers
> > into the message, so (optionally?) passing the skb as well can
> > allow us to fill the offset information.
> 
> I like it, nice work.

I don't like it all that much ;-)

Alexander Shishkin's patch
https://patchwork.kernel.org/patch/7162421/

was nice in a way because you could get away without passing the error
structure down all the time, but like I said it doesn't deal with
dynamic errors (even the offset/attr from nla_parse) and if it's not
done *everywhere* it risks leaking those exterr numbers (-1024..-4095)
to userspace through syscalls if it's used in a non-netlink path for
some reason (e.g. a single driver call being called through both
netlink and ioctl, and trying to return an extended error)

> I know people want dynamically generated strings and stuff, and we
> can get there, but I prefer that the first thing we commit is super
> simple.
> 
> Someone gave me a hard time about the fact that we've been talking
> about this idea for years but nothing ever happens.
> 
> I'm tempted to apply this as-is just to show that person that things
> do in fact happen eventually :-)

Heh. I think I really want to solve - at least partially - nla_parse()
to see that it can be done this way. It'd be nice to even transform all
the callers (I generated half of these patches with spatch anyway) to
have at least that.

johannes


Re: [RFC 0/3] netlink: extended error reporting

2017-04-07 Thread David Miller
From: Johannes Berg 
Date: Fri,  7 Apr 2017 20:26:17 +0200

> So this is my first draft of what we'd talked about at netconf.
> I'm not super happy with the way we have to pass the extended
> error struct, but I don't see a way to implement reporting any
> dynamic information (like error offsets) in any other way.
> 
> Alexander Shishkin had a nice way of reporting static extended
> error data, but that isn't really suitable for reporting the
> offset or even reporting the broken attribute from nla_parse().
> 
> Speaking of nla_parse(), that'll be somewhat complicated to do
> since we'll have to track the offsets of where we're parsing,
> but it might be possible since the nlattrs are just pointers
> into the message, so (optionally?) passing the skb as well can
> allow us to fill the offset information.

I like it, nice work.

I know people want dynamically generated strings and stuff, and we can
get there, but I prefer that the first thing we commit is super simple.

Someone gave me a hard time about the fact that we've been talking
about this idea for years but nothing ever happens.

I'm tempted to apply this as-is just to show that person that things
do in fact happen eventually :-)


[RFC 0/3] netlink: extended error reporting

2017-04-07 Thread Johannes Berg
So this is my first draft of what we'd talked about at netconf.
I'm not super happy with the way we have to pass the extended
error struct, but I don't see a way to implement reporting any
dynamic information (like error offsets) in any other way.

Alexander Shishkin had a nice way of reporting static extended
error data, but that isn't really suitable for reporting the
offset or even reporting the broken attribute from nla_parse().

Speaking of nla_parse(), that'll be somewhat complicated to do
since we'll have to track the offsets of where we're parsing,
but it might be possible since the nlattrs are just pointers
into the message, so (optionally?) passing the skb as well can
allow us to fill the offset information.

johannes