Re: [ovs-dev] [PATCH v10 4/5] dpif-netdev: refactor generic implementation

2019-07-12 Thread Stokes, Ian
> On Fri, Jul 12, 2019 at 04:20:06PM +, Van Haaren, Harry wrote:
> > > The sizeof operator is unique among C operators in that it accepts two
> > > very different kinds of operands: an expression or a type. In general,
> > > prefer to specify an expression, e.g. int *x = xmalloc(sizeof *x);.
> When
> > > the operand of sizeof is an expression, there is no need to
> parenthesize
> > > that operand, and please don't.
> >
> > Hmmm, I get compile failures here with "sizeof uint64_t" instead of
> "sizeof(uint64_t)".
> > There's also some ambiguity possible, see here:
> https://stackoverflow.com/a/26702432
> >
> > In short, I've left the sizeof() with brackets, as code that compiles
> wins in the end.
> 
> uint64_t is an example of a type rather than an expression.  With a
> type, the parentheses are required.

Ah, thanks for clarifying Ben/Ilya.

Regards
Ian

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v10 4/5] dpif-netdev: refactor generic implementation

2019-07-12 Thread Ben Pfaff
On Fri, Jul 12, 2019 at 04:20:06PM +, Van Haaren, Harry wrote:
> > The sizeof operator is unique among C operators in that it accepts two
> > very different kinds of operands: an expression or a type. In general,
> > prefer to specify an expression, e.g. int *x = xmalloc(sizeof *x);. When
> > the operand of sizeof is an expression, there is no need to parenthesize
> > that operand, and please don't.
> 
> Hmmm, I get compile failures here with "sizeof uint64_t" instead of 
> "sizeof(uint64_t)".
> There's also some ambiguity possible, see here: 
> https://stackoverflow.com/a/26702432
> 
> In short, I've left the sizeof() with brackets, as code that compiles wins in 
> the end.

uint64_t is an example of a type rather than an expression.  With a
type, the parentheses are required.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v10 4/5] dpif-netdev: refactor generic implementation

2019-07-12 Thread Ilya Maximets
On 12.07.2019 19:20, Van Haaren, Harry wrote:
>>> +subtable->mf_masks = xmalloc(sizeof(uint64_t) * (unit0 + unit1));
>>> +netdev_flow_key_gen_masks(mask, subtable->mf_masks, unit0, unit1);
>>> +
>>> +/* Allocate blocks scratch space only if subtable requires more size
>> than
>>> + * is currently allocated. */
>>> +const uint32_t blocks_required_per_pkt = unit0 + unit1;
>>> +if (cls->blocks_scratch_size < blocks_required_per_pkt) {
>>> +free(cls->blocks_scratch);
>>> +cls->blocks_scratch = xmalloc(sizeof(uint64_t) * NETDEV_MAX_BURST *
>>> +  blocks_required_per_pkt);
>>
>> For sizeof above I don't think you need the () as it's part of an
>> expression. See OVS code standard below. This may apply to the xmalloc
>> for mf_masks above as well.
>>
>> The sizeof operator is unique among C operators in that it accepts two
>> very different kinds of operands: an expression or a type. In general,
>> prefer to specify an expression, e.g. int *x = xmalloc(sizeof *x);. When
>> the operand of sizeof is an expression, there is no need to parenthesize
>> that operand, and please don't.
> 
> Hmmm, I get compile failures here with "sizeof uint64_t" instead of 
> "sizeof(uint64_t)".
> There's also some ambiguity possible, see here: 
> https://stackoverflow.com/a/26702432
> 
> In short, I've left the sizeof() with brackets, as code that compiles wins in 
> the end.

It's suggested to use sizeof without brackets but for the pointer variable.
So, in OVS coding style this should look like:

   cls->blocks_scratch = xmalloc(NETDEV_MAX_BURST
 * blocks_required_per_pkt
 * sizeof *cls->blocks_scratch);

For mf_masks:

   subtable->mf_masks = xmalloc((unit0 + unit1) * sizeof *subtable->mf_masks);

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v10 4/5] dpif-netdev: refactor generic implementation

2019-07-12 Thread Van Haaren, Harry
> -Original Message-
> From: Stokes, Ian
> Sent: Friday, July 12, 2019 3:19 PM
> To: Van Haaren, Harry ; d...@openvswitch.org
> Cc: i.maxim...@samsung.com; malvika.gu...@arm.com
> Subject: Re: [PATCH v10 4/5] dpif-netdev: refactor generic implementation
> 
> On 7/9/2019 1:34 PM, Harry van Haaren wrote:
> > This commit refactors the generic implementation. The
> > goal of this refactor is to simply the code to enable
> 'simply' -> 'simplify'?

Simply done now :)


> > -/* Returns a hash value for the bits of 'key' where there are 1-bits in
> > - * 'mask'. */
> > -static inline uint32_t
> > -netdev_flow_key_hash_in_mask(const struct netdev_flow_key *key,
> > - const struct netdev_flow_key *mask)
> > +VLOG_DEFINE_THIS_MODULE(dpif_lookup_generic);
> > +
> > +/* netdev_flow_key_flatten_unit:
> 
> No need to mention function name in comments. Applies to other function
> comments in the patch also.
> 
> For function comments in OVS you can follow:
> 
> http://docs.openvswitch.org/en/latest/internals/contributing/coding-
> style/#functions

Ah thanks - removed function names from comments.

> 
> > + * Given a packet, table and mf_masks, this function iterates over each bit
> > + * set in the subtable, and calculates the appropriate metadata to store in
> the
> > + * blocks_scratch[].
> > + *
> > + * The results of the blocks_scratch[] can be used for hashing, and later
> for
> > + * verification of if a rule matches the given packet.
> > + */
> > +static inline void
> > +netdev_flow_key_flatten_unit(const uint64_t *pkt_blocks,
> > + const uint64_t *tbl_blocks,
> > + const uint64_t *mf_masks,
> > + uint64_t *blocks_scratch,
> > + const uint64_t pkt_mf_bits,
> > + const uint32_t count)
> >   {
> > -const uint64_t *p = miniflow_get_values(>mf);
> > -uint32_t hash = 0;
> > -uint64_t value;
> > +uint32_t i;
> New line just to separate variable declaration form function body.

Done.


> > +for (i = 0; i < count; i++) {
> > +uint64_t mf_mask = mf_masks[i];
> > +/* Calculate the block index for the packet metadata */
> > +uint64_t idx_bits = mf_mask & pkt_mf_bits;
> > +const uint32_t pkt_idx = count_1bits(idx_bits);
> >
> > -NETDEV_FLOW_KEY_FOR_EACH_IN_FLOWMAP (value, key, mask->mf.map) {
> > -hash = hash_add64(hash, value & *p);
> > -p++;
> > +/* check if the packet has the subtable miniflow bit set. If yes,
> the
> 
> For above and for other comments in the patch, capitalization at
> beginning, period to finish.

Fixed in a number of places. Kinda like Pokemon, tryna catch 'em all!



> > +/* Hash the now linearized blocks of packet metadata */
> > +ULLONG_FOR_EACH_1(i, keys_map) {
> > + uint32_t hash = 0;
> > + uint32_t i_off = i * bit_count_total;
> > + for (int h = 0; h < bit_count_total; h++) {
> > + hash = hash_add64(hash, blocks_scratch[i_off + h]);
> > + }
> > + hashes[i] = hash_finish(hash, bit_count_total * 8);
> 
> Can we replace magic 8 above.

Done. Sizeof() with parenthesis required, failed to build without here.



> > diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> > index 190cc8918..03ab5267a 100644
> > --- a/lib/dpif-netdev.c
> > +++ b/lib/dpif-netdev.c
> > @@ -233,6 +233,15 @@ struct dpcls {
> >   odp_port_t in_port;
> >   struct cmap subtables_map;
> >   struct pvector subtables;
> > +
> > +/* Region of memory for this DPCLS instance to use as scratch.
> > + * Size is garaunteed to be large enough to hold all blocks required
> for
> 'garaunteed' -> 'guaranteed'

Done.



> > +subtable->mf_masks = xmalloc(sizeof(uint64_t) * (unit0 + unit1));
> > +netdev_flow_key_gen_masks(mask, subtable->mf_masks, unit0, unit1);
> > +
> > +/* Allocate blocks scratch space only if subtable requires more size
> than
> > + * is currently allocated. */
> > +const uint32_t blocks_required_per_pkt = unit0 + unit1;
> > +if (cls->blocks_scratch_size < blocks_required_per_pkt) {
> > +free(cls->blocks_scratch);
> > +cls->blocks_scratch = xmalloc(sizeof(uint64_t) * NETDEV_MAX_BURST *
> > +  blocks_required_per_pkt);
> 
> For sizeof above I don't think you need the () as it's part of an
> expression. See OVS code standard below. This may apply to the xmalloc
> for mf_masks above as well.
> 
> The sizeof operator is unique among C operators in that it accepts two
> very different kinds of operands: an expression or a type. In general,
> prefer to specify an expression, e.g. int *x = xmalloc(sizeof *x);. When
> the operand of sizeof is an expression, there is no need to parenthesize
> that operand, and please don't.

Hmmm, I get compile failures here with "sizeof uint64_t" instead of 
"sizeof(uint64_t)".
There's also some 

Re: [ovs-dev] [PATCH v10 4/5] dpif-netdev: refactor generic implementation

2019-07-12 Thread Van Haaren, Harry
> -Original Message-
> From: Ilya Maximets [mailto:i.maxim...@samsung.com]
> Sent: Friday, July 12, 2019 3:43 PM
> To: Stokes, Ian ; Van Haaren, Harry
> ; d...@openvswitch.org
> Cc: malvika.gu...@arm.com
> Subject: Re: [PATCH v10 4/5] dpif-netdev: refactor generic implementation



> >> +
> >> +    /* Hash the now linearized blocks of packet metadata */
> >> +    ULLONG_FOR_EACH_1(i, keys_map) {
> >> + uint32_t hash = 0;
> >> + uint32_t i_off = i * bit_count_total;
> >> + for (int h = 0; h < bit_count_total; h++) {
> >> + hash = hash_add64(hash, blocks_scratch[i_off + h]);
> >> + }
> >> + hashes[i] = hash_finish(hash, bit_count_total * 8);
> >
> > Can we replace magic 8 above.
> 
> Another question.
> Can we use:
> 
> ULLONG_FOR_EACH_1 (i, keys_map) {
>  hashes[i] = hash_words64_inline(_scratch[i * bit_count_total],
>  bit_count_total, 0);
> }
> ?
> There should be same effect.

Yep we can.

I did investigate hash_bytes() but this becomes a function call (unless inlined 
by LTO)
and the code the is less performant than the above loop due to handling non 8 
byte blocks.

But I missed  hash_words64_inline()  - learning every day :)

Cheers

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v10 4/5] dpif-netdev: refactor generic implementation

2019-07-12 Thread Ilya Maximets
On 12.07.2019 17:19, Ian Stokes wrote:
> On 7/9/2019 1:34 PM, Harry van Haaren wrote:
>> This commit refactors the generic implementation. The
>> goal of this refactor is to simply the code to enable
> 'simply' -> 'simplify'?
> 
>> "specialization" of the functions at compile time.
>>
>> Given compile-time optimizations, the compiler is able
>> to unroll loops, and create optimized code sequences due
>> to compile time knowledge of loop-trip counts.
>>
>> In order to enable these compiler optimizations, we must
>> refactor the code to pass the loop-trip counts to functions
>> as compile time constants.
>>
>> This patch allows the number of miniflow-bits set per "unit"
>> in the miniflow to be passed around as a function argument.
>>
>> Note that this patch does NOT yet take advantage of doing so,
>> this is only a refactor to enable it in the next patches.
>>
>> Signed-off-by: Harry van Haaren 
>> Tested-by: Malvika Gupta 
>>
> 
> Thanks for the v10 Harry, some comments below.
> 
>> ---
>>
>> v10:
>> - Rebase updates from previous patches
>> - Fix whitespace indentation of func params
>> - Removed restrict keyword, Windows CI failing when it is used (Ian)
>> - Fix integer 0 used to set NULL pointer (Ilya)
>> - Postpone free() call on cls->blocks_scratch (Ilya)
>> - Fix indentation of a function
>>
>> v9:
>> - Use count_1bits in favour of __builtin_popcount (Ilya)
>> - Use ALWAYS_INLINE instead of __attribute__ synatx (Ilya)
>>
>> v8:
>> - Rework block_cache and mf_masks to avoid variable-lenght array
>>    due to compiler issues. Provisioning for worst case is not a
>>    good solution due to magnitue of over-provisioning required.
>> - Rework netdev_flatten function removing unused parameter
>> ---
>>   lib/dpif-netdev-lookup-generic.c | 239 ---
>>   lib/dpif-netdev.c    |  77 +-
>>   lib/dpif-netdev.h    |  18 +++
>>   3 files changed, 281 insertions(+), 53 deletions(-)
>>
>> diff --git a/lib/dpif-netdev-lookup-generic.c 
>> b/lib/dpif-netdev-lookup-generic.c
>> index d49d4b570..432d8782e 100644
>> --- a/lib/dpif-netdev-lookup-generic.c
>> +++ b/lib/dpif-netdev-lookup-generic.c
>> @@ -29,67 +29,204 @@
>>   #include "packets.h"
>>   #include "pvector.h"
>>   -/* Returns a hash value for the bits of 'key' where there are 1-bits in
>> - * 'mask'. */
>> -static inline uint32_t
>> -netdev_flow_key_hash_in_mask(const struct netdev_flow_key *key,
>> - const struct netdev_flow_key *mask)
>> +VLOG_DEFINE_THIS_MODULE(dpif_lookup_generic);
>> +
>> +/* netdev_flow_key_flatten_unit:
> 
> No need to mention function name in comments. Applies to other function 
> comments in the patch also.
> 
> For function comments in OVS you can follow:
> 
> http://docs.openvswitch.org/en/latest/internals/contributing/coding-style/#functions
> 
>> + * Given a packet, table and mf_masks, this function iterates over each bit
>> + * set in the subtable, and calculates the appropriate metadata to store in 
>> the
>> + * blocks_scratch[].
>> + *
>> + * The results of the blocks_scratch[] can be used for hashing, and later 
>> for
>> + * verification of if a rule matches the given packet.
>> + */
>> +static inline void
>> +netdev_flow_key_flatten_unit(const uint64_t *pkt_blocks,
>> + const uint64_t *tbl_blocks,
>> + const uint64_t *mf_masks,
>> + uint64_t *blocks_scratch,
>> + const uint64_t pkt_mf_bits,
>> + const uint32_t count)
>>   {
>> -    const uint64_t *p = miniflow_get_values(>mf);
>> -    uint32_t hash = 0;
>> -    uint64_t value;
>> +    uint32_t i;
> New line just to separate variable declaration form function body.
> 
>> +    for (i = 0; i < count; i++) {
>> +    uint64_t mf_mask = mf_masks[i];
>> +    /* Calculate the block index for the packet metadata */
>> +    uint64_t idx_bits = mf_mask & pkt_mf_bits;
>> +    const uint32_t pkt_idx = count_1bits(idx_bits);
>>   -    NETDEV_FLOW_KEY_FOR_EACH_IN_FLOWMAP (value, key, mask->mf.map) {
>> -    hash = hash_add64(hash, value & *p);
>> -    p++;
>> +    /* check if the packet has the subtable miniflow bit set. If yes, 
>> the
> 
> For above and for other comments in the patch, capitalization at beginning, 
> period to finish.
> 
>> + * block at the above pkt_idx will be stored, otherwise it is masked
>> + * out to be zero.
>> + */
>> +    uint64_t pkt_has_mf_bit = (mf_mask + 1) & pkt_mf_bits;
>> +    uint64_t no_bit = ((!pkt_has_mf_bit) > 0) - 1;
>> +
>> +    /* mask packet block by table block, and mask to zero if packet
>> + * doesn't actually contain this block of metadata
>> + */
>> +    blocks_scratch[i] = pkt_blocks[pkt_idx] & tbl_blocks[i] & no_bit;
>>   }
>> +}
>> +
>> +/* netdev_flow_key_flatten:
>> + * This function takes a packet, and subtable and writes an 

Re: [ovs-dev] [PATCH v10 4/5] dpif-netdev: refactor generic implementation

2019-07-12 Thread Ian Stokes

On 7/9/2019 1:34 PM, Harry van Haaren wrote:

This commit refactors the generic implementation. The
goal of this refactor is to simply the code to enable

'simply' -> 'simplify'?


"specialization" of the functions at compile time.

Given compile-time optimizations, the compiler is able
to unroll loops, and create optimized code sequences due
to compile time knowledge of loop-trip counts.

In order to enable these compiler optimizations, we must
refactor the code to pass the loop-trip counts to functions
as compile time constants.

This patch allows the number of miniflow-bits set per "unit"
in the miniflow to be passed around as a function argument.

Note that this patch does NOT yet take advantage of doing so,
this is only a refactor to enable it in the next patches.

Signed-off-by: Harry van Haaren 
Tested-by: Malvika Gupta 



Thanks for the v10 Harry, some comments below.


---

v10:
- Rebase updates from previous patches
- Fix whitespace indentation of func params
- Removed restrict keyword, Windows CI failing when it is used (Ian)
- Fix integer 0 used to set NULL pointer (Ilya)
- Postpone free() call on cls->blocks_scratch (Ilya)
- Fix indentation of a function

v9:
- Use count_1bits in favour of __builtin_popcount (Ilya)
- Use ALWAYS_INLINE instead of __attribute__ synatx (Ilya)

v8:
- Rework block_cache and mf_masks to avoid variable-lenght array
   due to compiler issues. Provisioning for worst case is not a
   good solution due to magnitue of over-provisioning required.
- Rework netdev_flatten function removing unused parameter
---
  lib/dpif-netdev-lookup-generic.c | 239 ---
  lib/dpif-netdev.c|  77 +-
  lib/dpif-netdev.h|  18 +++
  3 files changed, 281 insertions(+), 53 deletions(-)

diff --git a/lib/dpif-netdev-lookup-generic.c b/lib/dpif-netdev-lookup-generic.c
index d49d4b570..432d8782e 100644
--- a/lib/dpif-netdev-lookup-generic.c
+++ b/lib/dpif-netdev-lookup-generic.c
@@ -29,67 +29,204 @@
  #include "packets.h"
  #include "pvector.h"
  
-/* Returns a hash value for the bits of 'key' where there are 1-bits in

- * 'mask'. */
-static inline uint32_t
-netdev_flow_key_hash_in_mask(const struct netdev_flow_key *key,
- const struct netdev_flow_key *mask)
+VLOG_DEFINE_THIS_MODULE(dpif_lookup_generic);
+
+/* netdev_flow_key_flatten_unit:


No need to mention function name in comments. Applies to other function 
comments in the patch also.


For function comments in OVS you can follow:

http://docs.openvswitch.org/en/latest/internals/contributing/coding-style/#functions


+ * Given a packet, table and mf_masks, this function iterates over each bit
+ * set in the subtable, and calculates the appropriate metadata to store in the
+ * blocks_scratch[].
+ *
+ * The results of the blocks_scratch[] can be used for hashing, and later for
+ * verification of if a rule matches the given packet.
+ */
+static inline void
+netdev_flow_key_flatten_unit(const uint64_t *pkt_blocks,
+ const uint64_t *tbl_blocks,
+ const uint64_t *mf_masks,
+ uint64_t *blocks_scratch,
+ const uint64_t pkt_mf_bits,
+ const uint32_t count)
  {
-const uint64_t *p = miniflow_get_values(>mf);
-uint32_t hash = 0;
-uint64_t value;
+uint32_t i;

New line just to separate variable declaration form function body.


+for (i = 0; i < count; i++) {
+uint64_t mf_mask = mf_masks[i];
+/* Calculate the block index for the packet metadata */
+uint64_t idx_bits = mf_mask & pkt_mf_bits;
+const uint32_t pkt_idx = count_1bits(idx_bits);
  
-NETDEV_FLOW_KEY_FOR_EACH_IN_FLOWMAP (value, key, mask->mf.map) {

-hash = hash_add64(hash, value & *p);
-p++;
+/* check if the packet has the subtable miniflow bit set. If yes, the


For above and for other comments in the patch, capitalization at 
beginning, period to finish.



+ * block at the above pkt_idx will be stored, otherwise it is masked
+ * out to be zero.
+ */
+uint64_t pkt_has_mf_bit = (mf_mask + 1) & pkt_mf_bits;
+uint64_t no_bit = ((!pkt_has_mf_bit) > 0) - 1;
+
+/* mask packet block by table block, and mask to zero if packet
+ * doesn't actually contain this block of metadata
+ */
+blocks_scratch[i] = pkt_blocks[pkt_idx] & tbl_blocks[i] & no_bit;
  }
+}
+
+/* netdev_flow_key_flatten:
+ * This function takes a packet, and subtable and writes an array of uint64_t
+ * blocks. The blocks contain the metadata that the subtable matches on, in
+ * the same order as the subtable, allowing linear iteration over the blocks.
+ *
+ * To calculate the blocks contents, the netdev_flow_key_flatten_unit function
+ * is called twice, once for each "unit" of the miniflow. This call can be
+ * inlined by the compiler for performance.