Re: [vpp-dev] MACIP ACL replace causes ip4_table_index change?

2017-12-11 Thread Dave Barach (dbarach)
Folks will miss a clib_warning, unless they check syslog. 

I'd consider returning VNET_API_ERROR_ENTRY_ALREADY_EXISTS and calling it a day.

Thanks… Dave

-Original Message-
From: Andrew 👽 Yourtchenko [mailto:ayour...@gmail.com] 
Sent: Sunday, December 10, 2017 9:04 AM
To: Dave Barach (dbarach) 
Cc: Jon Loeliger ; vpp-dev 
Subject: Re: [vpp-dev] MACIP ACL replace causes ip4_table_index change?

Dear Dave,

On 12/9/17, Dave Barach (dbarach)  wrote:
> This looks wrong... vnet_set_input_acl_intfc(...) at line 93:
>
>   /* Return ok on ADD operation if feature is already enabled */
>   if (is_add &&
>am->classify_table_index_by_sw_if_index[ti][sw_if_index] != ~0)
>  return 0;
>
> It’s been that way for a very long time.

Yeah so I am wondering what's the right approach on fixing it, I see
three alternatives:

1) "set" the new inacl even if there is an existing one applied..
upside: consistent with what "set" means in layman's terms; downside:
bigger change vs. the existing semantics which maybe is masking some
other issues.

2) return an error rather than zero, and let the callers deal with
this. upside: no big change of the semantics. downside: returning an
error might upset some callers that were "accidentally" relying on
this behaviour.

3) stick in a "clib_warning()" saying "This will soon return an error.
The calling code needs to ensure this is handled correctly", and wait
for one or two releases, and have a JIRA for the next release to
*then* do (2) in the next release.

If this behavior has been here sufficiently long, (3) seems like a
safest action..

What do you think ?

--a


>
> Thanks… Dave
>
> From: vpp-dev-boun...@lists.fd.io [mailto:vpp-dev-boun...@lists.fd.io] On
> Behalf Of Jon Loeliger
> Sent: Saturday, December 9, 2017 11:23 AM
> To: Andrew 👽 Yourtchenko 
> Cc: vpp-dev 
> Subject: Re: [vpp-dev] MACIP ACL replace causes ip4_table_index change?
>
> On Sat, Dec 9, 2017 at 8:16 AM, Andrew 👽 Yourtchenko
> mailto:ayour...@gmail.com>> wrote:
> Jon,
>
> Hi Andrew,
>
> Thanks for taking a look at this issue!
>
> on api trace: does the below work ? (even though the current scenario
> is trivially reproducible, the api traces are very useful for tougher
> cases, and save a lot of typing while storytelling).
>
> DBGvpp# api trace on
>
> . do the things 
>
> DBGvpp# api trace save macip-trace
> API trace saved to /tmp/macip-trace
> DBGvpp# api trace custom-dump /tmp/macip-trace
> SCRIPT: macip_acl_add_replace -1 count 1 count 1 \
>   ipv4 permit \
> src mac 00:00:00:00:00:00 mask 00:00:00:00:00:00 \
> src ip 0.0.0.0/0<http://0.0.0.0/0>, \
>
> SCRIPT: macip_acl_interface_add_del sw_if_index 0 acl_index 0 add
> SCRIPT: macip_acl_add_replace 0 count 1 count 1 \
>   ipv4 permit \
> src mac 00:00:00:00:00:00 mask 00:00:00:00:00:00 \
> src ip 0.0.0.0/0<http://0.0.0.0/0>, \
>
> I think that is the right sequence.
>
>
> Now, to the issue itself: it's exactly as I described, but with a twist:
> vnet_set_input_acl_intfc(), which is used under the hood to assign the
> inacl on the interfaces, is quite picky - if there is an existing
> inacl applied, it just quietly does nothing. (@DaveB - this kinda
> feels strange, I am not sure what the logic is behind doing that.)
>
> Anyway, rather than debating on why it behaves this way, and,
> especially since we actually are deleting the tables in question, it's
> better to unapply the inacls first, and then reapply them after the
> tables have been recreated.
>
> This solves half the problem for me.  It looks like I can properly
> turn around and remove this ACL from the interface now!
>
> But I still have doubts; or at least I don't understand why the
> three table indices are 3 after initial creation, and 0 after they
> are replaced.
>
> The result is in https://gerrit.fd.io/r/#/c/9772/ - you can verify
> that it addresses the issue.
>
> I've left a comment on the code there.
>
> Despite what Gerrit thinks, this code does compile and run for me!
> So maybe just a "rebuild" request there will allow it to verify?
>
>
> Now, going on to "how exactly did this slip through" - seems the macip
> tests are quite a bit too lenient than they should be. I'll need to
> address that as well, though probably I will split the dot1q/dot1ad
> test cases out, and in the process refactor things a bit... so in the
> interests of your time, maybe 9772 can go with just an actual code
> fix.
>
> I've not read through the test as they stand today.
> I'd like to understand the "3 vs. 0" issue before I am happy to
> Code Re

Re: [vpp-dev] MACIP ACL replace causes ip4_table_index change?

2017-12-10 Thread Andrew 👽 Yourtchenko
Dear Dave,

On 12/9/17, Dave Barach (dbarach)  wrote:
> This looks wrong... vnet_set_input_acl_intfc(...) at line 93:
>
>   /* Return ok on ADD operation if feature is already enabled */
>   if (is_add &&
>am->classify_table_index_by_sw_if_index[ti][sw_if_index] != ~0)
>  return 0;
>
> It’s been that way for a very long time.

Yeah so I am wondering what's the right approach on fixing it, I see
three alternatives:

1) "set" the new inacl even if there is an existing one applied..
upside: consistent with what "set" means in layman's terms; downside:
bigger change vs. the existing semantics which maybe is masking some
other issues.

2) return an error rather than zero, and let the callers deal with
this. upside: no big change of the semantics. downside: returning an
error might upset some callers that were "accidentally" relying on
this behaviour.

3) stick in a "clib_warning()" saying "This will soon return an error.
The calling code needs to ensure this is handled correctly", and wait
for one or two releases, and have a JIRA for the next release to
*then* do (2) in the next release.

If this behavior has been here sufficiently long, (3) seems like a
safest action..

What do you think ?

--a


>
> Thanks… Dave
>
> From: vpp-dev-boun...@lists.fd.io [mailto:vpp-dev-boun...@lists.fd.io] On
> Behalf Of Jon Loeliger
> Sent: Saturday, December 9, 2017 11:23 AM
> To: Andrew 👽 Yourtchenko 
> Cc: vpp-dev 
> Subject: Re: [vpp-dev] MACIP ACL replace causes ip4_table_index change?
>
> On Sat, Dec 9, 2017 at 8:16 AM, Andrew 👽 Yourtchenko
> mailto:ayour...@gmail.com>> wrote:
> Jon,
>
> Hi Andrew,
>
> Thanks for taking a look at this issue!
>
> on api trace: does the below work ? (even though the current scenario
> is trivially reproducible, the api traces are very useful for tougher
> cases, and save a lot of typing while storytelling).
>
> DBGvpp# api trace on
>
> . do the things 
>
> DBGvpp# api trace save macip-trace
> API trace saved to /tmp/macip-trace
> DBGvpp# api trace custom-dump /tmp/macip-trace
> SCRIPT: macip_acl_add_replace -1 count 1 count 1 \
>   ipv4 permit \
> src mac 00:00:00:00:00:00 mask 00:00:00:00:00:00 \
> src ip 0.0.0.0/0<http://0.0.0.0/0>, \
>
> SCRIPT: macip_acl_interface_add_del sw_if_index 0 acl_index 0 add
> SCRIPT: macip_acl_add_replace 0 count 1 count 1 \
>   ipv4 permit \
> src mac 00:00:00:00:00:00 mask 00:00:00:00:00:00 \
> src ip 0.0.0.0/0<http://0.0.0.0/0>, \
>
> I think that is the right sequence.
>
>
> Now, to the issue itself: it's exactly as I described, but with a twist:
> vnet_set_input_acl_intfc(), which is used under the hood to assign the
> inacl on the interfaces, is quite picky - if there is an existing
> inacl applied, it just quietly does nothing. (@DaveB - this kinda
> feels strange, I am not sure what the logic is behind doing that.)
>
> Anyway, rather than debating on why it behaves this way, and,
> especially since we actually are deleting the tables in question, it's
> better to unapply the inacls first, and then reapply them after the
> tables have been recreated.
>
> This solves half the problem for me.  It looks like I can properly
> turn around and remove this ACL from the interface now!
>
> But I still have doubts; or at least I don't understand why the
> three table indices are 3 after initial creation, and 0 after they
> are replaced.
>
> The result is in https://gerrit.fd.io/r/#/c/9772/ - you can verify
> that it addresses the issue.
>
> I've left a comment on the code there.
>
> Despite what Gerrit thinks, this code does compile and run for me!
> So maybe just a "rebuild" request there will allow it to verify?
>
>
> Now, going on to "how exactly did this slip through" - seems the macip
> tests are quite a bit too lenient than they should be. I'll need to
> address that as well, though probably I will split the dot1q/dot1ad
> test cases out, and in the process refactor things a bit... so in the
> interests of your time, maybe 9772 can go with just an actual code
> fix.
>
> I've not read through the test as they stand today.
> I'd like to understand the "3 vs. 0" issue before I am happy to
> Code Review +1 this patch.
>
> I've dumped the entire debugging process into a log, which turned out
> be fairly long, so to avoid sending the walls of text to the list,
> I've dumped it elsewhere:
> http://stdio.be/blog/2017-12-09-Debugging-VPP-MACIP-ACLs/
>
> And, excellent!  I read through that in quite some detail.  And I
> understand
> the "3 vs 0" issue I was seeing

Re: [vpp-dev] MACIP ACL replace causes ip4_table_index change?

2017-12-10 Thread Andrew 👽 Yourtchenko
Hi Jon,

On 12/9/17, Jon Loeliger  wrote:
> On Sat, Dec 9, 2017 at 8:16 AM, Andrew 👽 Yourtchenko 
> wrote:
>
>> Jon,
>>
>
> Hi Andrew,
>
> Thanks for taking a look at this issue!
>
>
>> on api trace: does the below work ? (even though the current scenario
>> is trivially reproducible, the api traces are very useful for tougher
>> cases, and save a lot of typing while storytelling).
>>
>> DBGvpp# api trace on
>>
>> . do the things 
>>
>> DBGvpp# api trace save macip-trace
>> API trace saved to /tmp/macip-trace
>> DBGvpp# api trace custom-dump /tmp/macip-trace
>> SCRIPT: macip_acl_add_replace -1 count 1 count 1 \
>>   ipv4 permit \
>> src mac 00:00:00:00:00:00 mask 00:00:00:00:00:00 \
>> src ip 0.0.0.0/0, \
>>
>> SCRIPT: macip_acl_interface_add_del sw_if_index 0 acl_index 0 add
>> SCRIPT: macip_acl_add_replace 0 count 1 count 1 \
>>   ipv4 permit \
>> src mac 00:00:00:00:00:00 mask 00:00:00:00:00:00 \
>> src ip 0.0.0.0/0, \
>>
>
> I think that is the right sequence.
>
>
> Now, to the issue itself: it's exactly as I described, but with a twist:
>> vnet_set_input_acl_intfc(), which is used under the hood to assign the
>> inacl on the interfaces, is quite picky - if there is an existing
>> inacl applied, it just quietly does nothing. (@DaveB - this kinda
>> feels strange, I am not sure what the logic is behind doing that.)
>>
>> Anyway, rather than debating on why it behaves this way, and,
>> especially since we actually are deleting the tables in question, it's
>> better to unapply the inacls first, and then reapply them after the
>> tables have been recreated.
>>
>
> This solves half the problem for me.  It looks like I can properly
> turn around and remove this ACL from the interface now!
>
> But I still have doubts; or at least I don't understand why the
> three table indices are 3 after initial creation, and 0 after they
> are replaced.
>
> The result is in https://gerrit.fd.io/r/#/c/9772/ - you can verify
>> that it addresses the issue.
>>
>
> I've left a comment on the code there.

if by "the issue" in the comment you mean the index being first zero
and then three,
this is not really a problem - like I wrote in the log... Anyway,
since I was also curious
to know a bit more about "why", let me expand on this here:
if I add this diff to the code:

diff --git a/src/plugins/acl/acl.c b/src/plugins/acl/acl.c
index 286b2e96..5b4b1828 100644
--- a/src/plugins/acl/acl.c
+++ b/src/plugins/acl/acl.c
@@ -476,6 +476,12 @@ acl_classify_add_del_table_small
(vnet_classify_main_t * cm, u8 * mask,
 table_index, current_data_flag,
 current_data_offset, is_add,
 1 /* delete_chain */ );
+  if (table_index) {
+if (is_add)
+  clib_warning ("CLASSIFY-SMALL-TABLE: New table index: %d for
mask %U\n", *table_index, format_hex_bytes, mask, mask_len);
+else
+  clib_warning ("CLASSIFY-SMALL-TABLE: deleting index %d", *table_index);
+  }
   clib_mem_set_heap (oldheap);
   return ret;
 }

I notice that upon the first creation the indices that we get are 0,
1, 2, 3, with the "3" being the last table created and being the head
of the table list, then when we do replace, the tables get deleted
(called three times with the same argument since it is in the
ipv4/ipv6/l2 table index - I should probably fix that), and the next
round of tables being created returns the new indices in the order of
3,2,1,0.

So, the table #0 is created last, is the new head of the chain - and
consequently this is the new index we see.

This difference this order comes from the logic that the
vnet_classify_new_table() uses to allocate new tables - it uses the
pool_get_aligned(), which first tries to reallocate the newly freed
elements from the freelist before expanding the pool. The freelist is
a vector, with the latest freed element at the tail of it. And the new
allocations use that element. Because table #3 is the head of the
chain, when deleting the chain, it is freed last. So, in the new
allocation it becomes the first table to be allocated. So, we get the
indices in reverse order.
If we do the same operation again, we will see that this phenomenon is
(as expected) symmetric - this time index 0 is the head of the list,
so it is freed up last, and becomes the first one available for
allocation by the pool alloc. So, if we do two updates, the indices
get again in the order of 0,1,2,3, and the head of the table chain
becomes again 3.

>
> Despite what Gerrit thinks, this code does compile and run for me!
> So maybe just a "rebuild" request there will allow it to verify?

Actually, the faulty test case (which was supposed to detect this) was botched
in a more interesting way than I thought. It seems to be a copy paste
or a merge error of some sorts, and it succeeded when the bug that you
caught was present, and failed when it was fixed... After staring at
the logic of it for a couple of hours I changed it

Re: [vpp-dev] MACIP ACL replace causes ip4_table_index change?

2017-12-09 Thread Dave Barach (dbarach)
This looks wrong... vnet_set_input_acl_intfc(...) at line 93:

  /* Return ok on ADD operation if feature is already enabled */
  if (is_add &&
   am->classify_table_index_by_sw_if_index[ti][sw_if_index] != ~0)
 return 0;

It’s been that way for a very long time.

Thanks… Dave

From: vpp-dev-boun...@lists.fd.io [mailto:vpp-dev-boun...@lists.fd.io] On 
Behalf Of Jon Loeliger
Sent: Saturday, December 9, 2017 11:23 AM
To: Andrew 👽 Yourtchenko 
Cc: vpp-dev 
Subject: Re: [vpp-dev] MACIP ACL replace causes ip4_table_index change?

On Sat, Dec 9, 2017 at 8:16 AM, Andrew 👽 Yourtchenko 
mailto:ayour...@gmail.com>> wrote:
Jon,

Hi Andrew,

Thanks for taking a look at this issue!

on api trace: does the below work ? (even though the current scenario
is trivially reproducible, the api traces are very useful for tougher
cases, and save a lot of typing while storytelling).

DBGvpp# api trace on

. do the things 

DBGvpp# api trace save macip-trace
API trace saved to /tmp/macip-trace
DBGvpp# api trace custom-dump /tmp/macip-trace
SCRIPT: macip_acl_add_replace -1 count 1 count 1 \
  ipv4 permit \
src mac 00:00:00:00:00:00 mask 00:00:00:00:00:00 \
src ip 0.0.0.0/0<http://0.0.0.0/0>, \

SCRIPT: macip_acl_interface_add_del sw_if_index 0 acl_index 0 add
SCRIPT: macip_acl_add_replace 0 count 1 count 1 \
  ipv4 permit \
src mac 00:00:00:00:00:00 mask 00:00:00:00:00:00 \
src ip 0.0.0.0/0<http://0.0.0.0/0>, \

I think that is the right sequence.


Now, to the issue itself: it's exactly as I described, but with a twist:
vnet_set_input_acl_intfc(), which is used under the hood to assign the
inacl on the interfaces, is quite picky - if there is an existing
inacl applied, it just quietly does nothing. (@DaveB - this kinda
feels strange, I am not sure what the logic is behind doing that.)

Anyway, rather than debating on why it behaves this way, and,
especially since we actually are deleting the tables in question, it's
better to unapply the inacls first, and then reapply them after the
tables have been recreated.

This solves half the problem for me.  It looks like I can properly
turn around and remove this ACL from the interface now!

But I still have doubts; or at least I don't understand why the
three table indices are 3 after initial creation, and 0 after they
are replaced.

The result is in https://gerrit.fd.io/r/#/c/9772/ - you can verify
that it addresses the issue.

I've left a comment on the code there.

Despite what Gerrit thinks, this code does compile and run for me!
So maybe just a "rebuild" request there will allow it to verify?


Now, going on to "how exactly did this slip through" - seems the macip
tests are quite a bit too lenient than they should be. I'll need to
address that as well, though probably I will split the dot1q/dot1ad
test cases out, and in the process refactor things a bit... so in the
interests of your time, maybe 9772 can go with just an actual code
fix.

I've not read through the test as they stand today.
I'd like to understand the "3 vs. 0" issue before I am happy to
Code Review +1 this patch.

I've dumped the entire debugging process into a log, which turned out
be fairly long, so to avoid sending the walls of text to the list,
I've dumped it elsewhere:
http://stdio.be/blog/2017-12-09-Debugging-VPP-MACIP-ACLs/

And, excellent!  I read through that in quite some detail.  And I understand
the "3 vs 0" issue I was seeing now!

The two pieces I missed were:  The "show inacl type l2" to see where
the chain was starting, and then noticing that the chain had indeed been
reversed once the starting point was known.

So thanks for that excellent debug layout and explanation.

And thanks for including the intermediate step of showing why simply
updating the interface at the end wasn't sufficient.  I had done that,
but hadn't yet gotten into the next function to see it was being ignored.

Thanks again for catching this.

Thanks for fixing this!


--a

So, with that, I've left  a comment and a Review -1 on the patch.
And the patch didn't Verify.  So where do we go from here?

I'm good with the patch, and we need to rebuild it.  So do we just
re-build the same patch or re-submit it as a new patch?  I will
either update or Review a-new to +1 it!

Thanks,
jdl

___
vpp-dev mailing list
vpp-dev@lists.fd.io
https://lists.fd.io/mailman/listinfo/vpp-dev

Re: [vpp-dev] MACIP ACL replace causes ip4_table_index change?

2017-12-09 Thread Jon Loeliger
On Sat, Dec 9, 2017 at 8:16 AM, Andrew 👽 Yourtchenko 
wrote:

> Jon,
>

Hi Andrew,

Thanks for taking a look at this issue!


> on api trace: does the below work ? (even though the current scenario
> is trivially reproducible, the api traces are very useful for tougher
> cases, and save a lot of typing while storytelling).
>
> DBGvpp# api trace on
>
> . do the things 
>
> DBGvpp# api trace save macip-trace
> API trace saved to /tmp/macip-trace
> DBGvpp# api trace custom-dump /tmp/macip-trace
> SCRIPT: macip_acl_add_replace -1 count 1 count 1 \
>   ipv4 permit \
> src mac 00:00:00:00:00:00 mask 00:00:00:00:00:00 \
> src ip 0.0.0.0/0, \
>
> SCRIPT: macip_acl_interface_add_del sw_if_index 0 acl_index 0 add
> SCRIPT: macip_acl_add_replace 0 count 1 count 1 \
>   ipv4 permit \
> src mac 00:00:00:00:00:00 mask 00:00:00:00:00:00 \
> src ip 0.0.0.0/0, \
>

I think that is the right sequence.


Now, to the issue itself: it's exactly as I described, but with a twist:
> vnet_set_input_acl_intfc(), which is used under the hood to assign the
> inacl on the interfaces, is quite picky - if there is an existing
> inacl applied, it just quietly does nothing. (@DaveB - this kinda
> feels strange, I am not sure what the logic is behind doing that.)
>
> Anyway, rather than debating on why it behaves this way, and,
> especially since we actually are deleting the tables in question, it's
> better to unapply the inacls first, and then reapply them after the
> tables have been recreated.
>

This solves half the problem for me.  It looks like I can properly
turn around and remove this ACL from the interface now!

But I still have doubts; or at least I don't understand why the
three table indices are 3 after initial creation, and 0 after they
are replaced.

The result is in https://gerrit.fd.io/r/#/c/9772/ - you can verify
> that it addresses the issue.
>

I've left a comment on the code there.

Despite what Gerrit thinks, this code does compile and run for me!
So maybe just a "rebuild" request there will allow it to verify?



> Now, going on to "how exactly did this slip through" - seems the macip
> tests are quite a bit too lenient than they should be. I'll need to
> address that as well, though probably I will split the dot1q/dot1ad
> test cases out, and in the process refactor things a bit... so in the
> interests of your time, maybe 9772 can go with just an actual code
> fix.
>

I've not read through the test as they stand today.
I'd like to understand the "3 vs. 0" issue before I am happy to
Code Review +1 this patch.


> I've dumped the entire debugging process into a log, which turned out
> be fairly long, so to avoid sending the walls of text to the list,
> I've dumped it elsewhere:
> http://stdio.be/blog/2017-12-09-Debugging-VPP-MACIP-ACLs/


And, excellent!  I read through that in quite some detail.  And I understand
the "3 vs 0" issue I was seeing now!

The two pieces I missed were:  The "show inacl type l2" to see where
the chain was starting, and then noticing that the chain had indeed been
reversed once the starting point was known.

So thanks for that excellent debug layout and explanation.

And thanks for including the intermediate step of showing why simply
updating the interface at the end wasn't sufficient.  I had done that,
but hadn't yet gotten into the next function to see it was being ignored.

Thanks again for catching this.


Thanks for fixing this!


>
> --a


So, with that, I've left  a comment and a Review -1 on the patch.
And the patch didn't Verify.  So where do we go from here?

I'm good with the patch, and we need to rebuild it.  So do we just
re-build the same patch or re-submit it as a new patch?  I will
either update or Review a-new to +1 it!

Thanks,
jdl
___
vpp-dev mailing list
vpp-dev@lists.fd.io
https://lists.fd.io/mailman/listinfo/vpp-dev

Re: [vpp-dev] MACIP ACL replace causes ip4_table_index change?

2017-12-09 Thread Andrew 👽 Yourtchenko
Jon,

on api trace: does the below work ? (even though the current scenario
is trivially reproducible, the api traces are very useful for tougher
cases, and save a lot of typing while storytelling).

DBGvpp# api trace on

. do the things 

DBGvpp# api trace save macip-trace
API trace saved to /tmp/macip-trace
DBGvpp# api trace custom-dump /tmp/macip-trace
SCRIPT: macip_acl_add_replace -1 count 1 count 1 \
  ipv4 permit \
src mac 00:00:00:00:00:00 mask 00:00:00:00:00:00 \
src ip 0.0.0.0/0, \

SCRIPT: macip_acl_interface_add_del sw_if_index 0 acl_index 0 add
SCRIPT: macip_acl_add_replace 0 count 1 count 1 \
  ipv4 permit \
src mac 00:00:00:00:00:00 mask 00:00:00:00:00:00 \
src ip 0.0.0.0/0, \

Now, to the issue itself: it's exactly as I described, but with a twist:
vnet_set_input_acl_intfc(), which is used under the hood to assign the
inacl on the interfaces, is quite picky - if there is an existing
inacl applied, it just quietly does nothing. (@DaveB - this kinda
feels strange, I am not sure what the logic is behind doing that.)

Anyway, rather than debating on why it behaves this way, and,
especially since we actually are deleting the tables in question, it's
better to unapply the inacls first, and then reapply them after the
tables have been recreated.

The result is in https://gerrit.fd.io/r/#/c/9772/ - you can verify
that it addresses the issue.

Now, going on to "how exactly did this slip through" - seems the macip
tests are quite a bit too lenient than they should be. I'll need to
address that as well, though probably I will split the dot1q/dot1ad
test cases out, and in the process refactor things a bit... so in the
interests of your time, maybe 9772 can go with just an actual code
fix.

I've dumped the entire debugging process into a log, which turned out
be fairly long, so to avoid sending the walls of text to the list,
I've dumped it elsewhere:
http://stdio.be/blog/2017-12-09-Debugging-VPP-MACIP-ACLs/

Thanks again for catching this.

--a


On 12/8/17, Jon Loeliger  wrote:
> On Fri, Dec 8, 2017 at 12:55 PM, Jon Loeliger  wrote:
>>
>>
>>
>>> We need to iterate the am->macip_acl_by_sw_if_index and for interfaces
>>> which have the acl in question applied, reapply these new tables to
>>> these
>>> interfaces so the already active macip acl remained active.
>>>
>>
>> Ah, I see
>>
>
> I'm not convinced this is the issue.  I added that loop, and it doesn't
> change things.  Specifically, a->ip4_table_index is 0, not 3 after the
> creation of the classify tables.
>
> So, here is sort of a trace through some key places of the
> classify table construction.  It follows the API sequence I
> outlined in an earlier message (create MACIP with one permit rule,
> then bind to interface, then modify that MACIP's rule to deny
> and update it using macip_acl_add_replace).
>
> I'll annotate this some.
>
> Here is the initial creation of the classify tables:
>
> macip_acl_add_list: looks like a create
> macip_create_classify_tables: top a->count 1
> macip_create_classify_tables: last_table before 4294967295
>
> Now we create the rules in reverse order:
>
> macip_create_classify_tables: ARP looop top
> macip_create_classify_tables: ARP loop last_table 0
> macip_create_classify_tables: IP[46] loop top
> macip_create_classify_tables: IP[46]  loop last_table 1
> macip_create_classify_tables: IP[46]  loop last_table 2
> macip_create_classify_tables: IP[46]  loop last_table 3
> macip_create_classify_tables: ip4_t_i 3
> macip_create_classify_tables: ip6_t_i 3
> macip_create_classify_tables:  l2_t_i 3
>
> That leaves last_table at 3, the most recently created rule,
> with a chain from 0 through 3:
>
> vpp# show classify tables
>   TableIdx  Sessions   NextTbl  NextNode
>  0 1 1-1
>   Heap: 3 objects, 204 of 2k used, 76 free, 0 reclaimed, 1k overhead, 8188k
> capacity
>   nbuckets 32, skip 0 match 2 flag 0 offset 0
>   mask 
>   linear-search buckets 0
>  1 1 2-1
>   Heap: 3 objects, 204 of 2k used, 76 free, 0 reclaimed, 1k overhead, 8188k
> capacity
>   nbuckets 32, skip 0 match 2 flag 0 offset 0
>   mask 
>   linear-search buckets 0
>  2 1 3-1
>   Heap: 3 objects, 236 of 2k used, 76 free, 0 reclaimed, 1k overhead, 8188k
> capacity
>   nbuckets 32, skip 0 match 3 flag 0 offset 0
>   mask :
> 
>0020: 
>   linear-search buckets 0
>  3 1-1 0
>   Heap: 3 objects, 204 of 2k used, 76 free, 0 reclaimed, 1k overhead, 8188k
> capacity
>   nbuckets 32, skip 0 match 2 flag 0 offset 0
>   mask 
>
> Now I bind that MACIP to the interface, and get:
>
> vpp# show acl-plugin macip inter

Re: [vpp-dev] MACIP ACL replace causes ip4_table_index change?

2017-12-08 Thread Jon Loeliger
On Fri, Dec 8, 2017 at 12:55 PM, Jon Loeliger  wrote:
>
>
>
>> We need to iterate the am->macip_acl_by_sw_if_index and for interfaces
>> which have the acl in question applied, reapply these new tables to these
>> interfaces so the already active macip acl remained active.
>>
>
> Ah, I see
>

I'm not convinced this is the issue.  I added that loop, and it doesn't
change things.  Specifically, a->ip4_table_index is 0, not 3 after the
creation of the classify tables.

So, here is sort of a trace through some key places of the
classify table construction.  It follows the API sequence I
outlined in an earlier message (create MACIP with one permit rule,
then bind to interface, then modify that MACIP's rule to deny
and update it using macip_acl_add_replace).

I'll annotate this some.

Here is the initial creation of the classify tables:

macip_acl_add_list: looks like a create
macip_create_classify_tables: top a->count 1
macip_create_classify_tables: last_table before 4294967295

Now we create the rules in reverse order:

macip_create_classify_tables: ARP looop top
macip_create_classify_tables: ARP loop last_table 0
macip_create_classify_tables: IP[46] loop top
macip_create_classify_tables: IP[46]  loop last_table 1
macip_create_classify_tables: IP[46]  loop last_table 2
macip_create_classify_tables: IP[46]  loop last_table 3
macip_create_classify_tables: ip4_t_i 3
macip_create_classify_tables: ip6_t_i 3
macip_create_classify_tables:  l2_t_i 3

That leaves last_table at 3, the most recently created rule,
with a chain from 0 through 3:

vpp# show classify tables
  TableIdx  Sessions   NextTbl  NextNode
 0 1 1-1
  Heap: 3 objects, 204 of 2k used, 76 free, 0 reclaimed, 1k overhead, 8188k
capacity
  nbuckets 32, skip 0 match 2 flag 0 offset 0
  mask 
  linear-search buckets 0
 1 1 2-1
  Heap: 3 objects, 204 of 2k used, 76 free, 0 reclaimed, 1k overhead, 8188k
capacity
  nbuckets 32, skip 0 match 2 flag 0 offset 0
  mask 
  linear-search buckets 0
 2 1 3-1
  Heap: 3 objects, 236 of 2k used, 76 free, 0 reclaimed, 1k overhead, 8188k
capacity
  nbuckets 32, skip 0 match 3 flag 0 offset 0
  mask :

   0020: 
  linear-search buckets 0
 3 1-1 0
  Heap: 3 objects, 204 of 2k used, 76 free, 0 reclaimed, 1k overhead, 8188k
capacity
  nbuckets 32, skip 0 match 2 flag 0 offset 0
  mask 

Now I bind that MACIP to the interface, and get:

vpp# show acl-plugin macip interface
  sw_if_index 0: -1
  sw_if_index 1: 0
vpp# show acl-plugin macip acl
MACIP acl_index: 0, count: 1 (true len 1) tag {bob} is free pool slot: 0
  ip4_table_index 3, ip6_table_index 3, l2_table_index 3
rule 0: ipv4 action 1 ip 0.0.0.0/0 mac 00:00:00:00:00:00 mask
00:00:00:00:00:00

Now I update the MACIP rule, changing "permit" to "deny".
And I call macip_acl_add_replace():

macip_acl_add_list: replacing ACLs now, acl_list_index is 0
macip_acl_interface_add_acl: macip_acl_index 0
macip_acl_add_list: looks like a replace
macip_create_classify_tables: top a->count 1
macip_create_classify_tables: last_table before 4294967295

That's just like before.  Good.

And now we count down the existing matching rules:

macip_create_classify_tables: ARP looop top
macip_create_classify_tables: ARP loop last_table 3
macip_create_classify_tables: IP[46] loop top
macip_create_classify_tables: IP[46]  loop last_table 2
macip_create_classify_tables: IP[46]  loop last_table 1
macip_create_classify_tables: IP[46]  loop last_table 0

And leave last_table at 0.
Which we then store in the {ip4,ip6,l2}_table_index variables:

macip_create_classify_tables: ip4_t_i 0
macip_create_classify_tables: ip6_t_i 0
macip_create_classify_tables:  l2_t_i 0

This is my loop added to the bottom of macip_acl_add_list():

  for (i = 0; i < vec_len (am->macip_acl_by_sw_if_index); i++)
{
  vlib_cli_output (vm, "%s: check swif %d\n", __func__, i);

  if (am->macip_acl_by_sw_if_index[i] == *acl_list_index)
{
  vlib_cli_output (vm, "%s: re-add acl_list_index %d\n", __func__,
*acl\
_list_index);
  macip_acl_interface_add_acl (am, i, *acl_list_index);
}
}


macip_acl_add_list: replacing ACLs now, acl_list_index is 0
macip_acl_add_list: check swif 0
macip_acl_add_list: check swif 1
macip_acl_add_list: re-add acl_list_index 0
macip_acl_interface_add_acl: macip_acl_index 0

So the MACIP ACL is applied to the SW IF, I believe, but
the MACIP ACL itself appears to be pointing to the wrong
end of the classify table chain in either the "create" case,
or in the "replace" case.  But I think they should be the same.

HTH,
jdl
_

Re: [vpp-dev] MACIP ACL replace causes ip4_table_index change?

2017-12-08 Thread Jon Loeliger
On Fri, Dec 8, 2017 at 12:34 PM, Andrew Yourtchenko 
wrote:

> Jon,
>
> Do you have an api trace you could throw in my direction unicast ?
>

I'd have to work at that quite a bit as I am using a totally new and
different UI to drive these API calls...  I can outline it for you:

macip_index = create macip acl with one rule "permit"
call interface bind (some sw_if_index, macip_index)

show the classifier tables
show the MACIP interface bindings
show the MACIP acls

call macip_acl_add_replace(macip_index, "change rule to deny")

show the new classifier tables (look ok)
show the MACIP interface bindings (look ok)
show the MACIP acls (look ok except the {ip4,ip6,l2}_table_index values
are 0; should be 3?)


> Macip_add_replace call was added later than most ACL plugins, (after
> realizing the convenience of this approach in the policy ACLs vs the
> unapply/delete acl/readd acl/reapply and for consistency), via commit
> c29940c58de3e44c0c1dd5c4eda5e0268d963b14
>

I know.  In fact, I had a different bug reported against our UI due
to the old way of removing the rules and then not getting them
re-applied upon addition correctly.

So when I saw the new API call (add_replace), I jumped on it!

I think I've just re-discovered the same underlying problem just
two layers deeper into the MACIP implementation now. :-/



> That change modified the existing “macip_acl_add_list” routine to also get
> the replace semantics, by deleting and then recreating the classifier
> tables anew.
>

I've just read through the macip_create_classifier_tables(), so yeah...


> This doesn’t work very well if the classifier tables are applied
> somewhere: they are removed, but the new tables are not applied
> automatically.
>

Right.  That's what I think I am seeing.


> We need to iterate the am->macip_acl_by_sw_if_index and for interfaces
> which have the acl in question applied, reapply these new tables to these
> interfaces so the already active macip acl remained active.
>

Ah, I see


> This is a bug. Sorry for that.
>

No worries.  I've written a bug or two in my day...


>  Since there were also tests in that commit which ideally should have
> caught this, i will also need to take a double check at what is going on
> there (my guess missing negative tests), and fix it too, but not sure I can
> pull off thumbtyping in a vi session,
>

I see your problem here. No one should have to use vi. :-)


> so I will add you to the changerequest with the fix once I have it, hope
> in a day or so.
>

Excellent!

Thank you!
jdl
___
vpp-dev mailing list
vpp-dev@lists.fd.io
https://lists.fd.io/mailman/listinfo/vpp-dev

Re: [vpp-dev] MACIP ACL replace causes ip4_table_index change?

2017-12-08 Thread Andrew Yourtchenko
Jon,

Do you have an api trace you could throw in my direction unicast ?

Macip_add_replace call was added later than most ACL plugins, (after realizing 
the convenience of this approach in the policy ACLs vs the unapply/delete 
acl/readd acl/reapply and for consistency), via commit 
c29940c58de3e44c0c1dd5c4eda5e0268d963b14

That change modified the existing “macip_acl_add_list” routine to also get the 
replace semantics, by deleting and then recreating the classifier tables anew.

This doesn’t work very well if the classifier tables are applied somewhere: 
they are removed, but the new tables are not applied automatically. We need to 
iterate the am->macip_acl_by_sw_if_index and for interfaces which have the acl 
in question applied, reapply these new tables to these interfaces so the 
already active macip acl remained active. This is a bug. Sorry for that.

 Since there were also tests in that commit which ideally should have caught 
this, i will also need to take a double check at what is going on there (my 
guess missing negative tests), and fix it too, but not sure I can pull off 
thumbtyping in a vi session, so I will add you to the changerequest with the 
fix once I have it, hope in a day or so.

--a

>>> On 8 Dec 2017, at 17:28, Jon Loeliger  wrote:
>>> 
>>> On Fri, Dec 8, 2017 at 11:24 AM, Jon Loeliger  wrote:
>>> On Fri, Dec 8, 2017 at 10:56 AM, Jon Loeliger  wrote:
>>> 
>>> vpp# show acl-plugin macip acl
>>> MACIP acl_index: 0, count: 1 (true len 1) tag {bob} is free pool slot: 0
>>>   ip4_table_index 0, ip6_table_index 0, l2_table_index 0
>>> rule 0: ipv4 action 0 ip 0.0.0.0/0 mac 00:00:00:00:00:00 mask 
>>> 00:00:00:00:00:00
>>> 
>>> Notice that the ip4_table_index has changed from 3 in the first two 'show'
>>> command outputs, while it is now 0 in the 3rd 'show' output.
>>> 
>>> My guess is it should be a consistent value throughout, and I think it 
>>> should
>>> be table 3, but I'm not certain yet.
>>> 
>>> When I then go to remove the MACIP from the interface, I am told error -65,
>>> which is "No such table."
>>> 
>>> Should it have copied the ip4_table_index 3 to the replaced MACIP as it 
>>> stands
>>> after the macip_add_replace API call?
>>> 
>>> Or should the original MACIP ACL have inherited the table number 0 from the
>>> interface when it was first bound there?
> 
> Sorry.  I also hate gmail.  But I could digress acerbically.
> 
> Anyway.
> 
> I meant to say that I've read enough to know now that these are classifier
> table indices.  Which means the 3rd, unlisted option is more likely here:
> As the ACL changed, update these table to use the _new_ classifier table
> indices.  That should likely be table 4 unless the classifier can positively
> determine that there are no other users of tables 3 and can do it in place,
> in which case 3 is the correct answer.
> 
> But not 0.
> 
> Thanks,
> jdl
> 
> ___
> vpp-dev mailing list
> vpp-dev@lists.fd.io
> https://lists.fd.io/mailman/listinfo/vpp-dev
___
vpp-dev mailing list
vpp-dev@lists.fd.io
https://lists.fd.io/mailman/listinfo/vpp-dev

Re: [vpp-dev] MACIP ACL replace causes ip4_table_index change?

2017-12-08 Thread Jon Loeliger
On Fri, Dec 8, 2017 at 11:24 AM, Jon Loeliger  wrote:

> On Fri, Dec 8, 2017 at 10:56 AM, Jon Loeliger  wrote:
>
>>
>> vpp# show acl-plugin macip acl
>> MACIP acl_index: 0, count: 1 (true len 1) tag {bob} is free pool slot: 0
>>   ip4_table_index 0, ip6_table_index 0, l2_table_index 0
>> rule 0: ipv4 action 0 ip 0.0.0.0/0 mac 00:00:00:00:00:00 mask
>> 00:00:00:00:00:00
>>
>> Notice that the ip4_table_index has changed from 3 in the first two 'show'
>> command outputs, while it is now 0 in the 3rd 'show' output.
>>
>> My guess is it should be a consistent value throughout, and I think it
>> should
>> be table 3, but I'm not certain yet.
>>
>> When I then go to remove the MACIP from the interface, I am told error
>> -65,
>> which is "No such table."
>>
>> Should it have copied the ip4_table_index 3 to the replaced MACIP as it
>> stands
>> after the macip_add_replace API call?
>>
>> Or should the original MACIP ACL have inherited the table number 0 from
>> the
>> interface when it was first bound there?
>>
>
Sorry.  I also hate gmail.  But I could digress acerbically.

Anyway.

I meant to say that I've read enough to know now that these are classifier
table indices.  Which means the 3rd, unlisted option is more likely here:
As the ACL changed, update these table to use the _new_ classifier table
indices.  That should likely be table 4 unless the classifier can positively
determine that there are no other users of tables 3 and can do it in place,
in which case 3 is the correct answer.

But not 0.

Thanks,
jdl
___
vpp-dev mailing list
vpp-dev@lists.fd.io
https://lists.fd.io/mailman/listinfo/vpp-dev

Re: [vpp-dev] MACIP ACL replace causes ip4_table_index change?

2017-12-08 Thread Jon Loeliger
On Fri, Dec 8, 2017 at 10:56 AM, Jon Loeliger  wrote:

>
> vpp# show acl-plugin macip acl
> MACIP acl_index: 0, count: 1 (true len 1) tag {bob} is free pool slot: 0
>   ip4_table_index 0, ip6_table_index 0, l2_table_index 0
> rule 0: ipv4 action 0 ip 0.0.0.0/0 mac 00:00:00:00:00:00 mask
> 00:00:00:00:00:00
>
> Notice that the ip4_table_index has changed from 3 in the first two 'show'
> command outputs, while it is now 0 in the 3rd 'show' output.
>
> My guess is it should be a consistent value throughout, and I think it
> should
> be table 3, but I'm not certain yet.
>
> When I then go to remove the MACIP from the interface, I am told error -65,
> which is "No such table."
>
> So.
>
> Should it have copied the ip4_table_index 3 to the replaced MACIP as it
> stands
> after the macip_add_replace API call?
>
> Or should the original MACIP ACL have inherited the table number 0 from the
> interface when it was first bound there?
>
___
vpp-dev mailing list
vpp-dev@lists.fd.io
https://lists.fd.io/mailman/listinfo/vpp-dev

[vpp-dev] MACIP ACL replace causes ip4_table_index change?

2017-12-08 Thread Jon Loeliger
Hey VPP Fans,

I've detected a slight anomaly in the handling of MACIP ACLs, and
would like some help tracking down the right solution.

I start by making a MACIP ACL. vppctl shows:

vpp# show acl-plugin macip acl
MACIP acl_index: 0, count: 1 (true len 1) tag {bob} is free pool slot: 0
  ip4_table_index 3, ip6_table_index 3, l2_table_index 3
rule 0: ipv4 action 1 ip 0.0.0.0/0 mac 00:00:00:00:00:00 mask
00:00:00:00:00:00

I then attach that to an interface, and vppctl still shows:

vpp# show acl-plugin macip acl
MACIP acl_index: 0, count: 1 (true len 1) tag {bob} is free pool slot: 0
  ip4_table_index 3, ip6_table_index 3, l2_table_index 3
rule 0: ipv4 action 1 ip 0.0.0.0/0 mac 00:00:00:00:00:00 mask
00:00:00:00:00:00

Then, I change the MACIP rule from permit to deny using the
API call macip_acl_add_replace to adjust it in-place.  Now vppctl shows:

vpp# show acl-plugin macip acl
MACIP acl_index: 0, count: 1 (true len 1) tag {bob} is free pool slot: 0
  ip4_table_index 0, ip6_table_index 0, l2_table_index 0
rule 0: ipv4 action 0 ip 0.0.0.0/0 mac 00:00:00:00:00:00 mask
00:00:00:00:00:00

Notice that the ip4_table_index has changed from 3 in the first two 'show'
command outputs, while it is now 0 in the 3rd 'show' output.

My guess is it should be a consistent value throughout, and I think it
should
be table 3, but I'm not certain yet.

When I then go to remove the MACIP from the interface, I am told error -65,
which is "No such table."

So.

Should it have copied the ip4_table_index 3 to the replaced MACIP as it
stands
after the macip_add_replace API call?

Or should the original MACIP ACL have inherited the table number 0 from the
interface when it was first bound there?

Given that the complaint (upon deletion) is about table 0 being invalid (as
it should
because that table is "permanently present", right?), I suspect that it
should have
copied the 3 to the new (after replacement) MACIP.

I'll go digging some more, but thought I'd just throw this out there in case
anyone knows better or more than I.

Thanks,
jdl
___
vpp-dev mailing list
vpp-dev@lists.fd.io
https://lists.fd.io/mailman/listinfo/vpp-dev