Re: SMP steroids for PF

2015-06-26 Thread mxb


And the rest of us is watching and waiting for diffs to apply :)
It is like exiting movie - I have popcorn in front.
Culmination must be soon, I guess :)

//mxb

On 2015-06-26 19:09, Martin Pieuchot wrote:

On 26/06/15(Fri) 17:19, Alexandr Nedvedicky wrote:

On Fri, Jun 26, 2015 at 04:34:06PM +0200, Martin Pieuchot wrote:

On 26/06/15(Fri) 16:00, Alexandr Nedvedicky wrote:

2)  I saw that you found some ALTQ leftovers, you have some Solaris

(2) I think ALTQs leftovers are still in CVS repo, will double check
anyway. Stack alignment is not Solaris compatibility hack it's sparc
compatibility. May be your C compiler takes care of this and grants
16/32/64 bit stack alignment. I have not examined build process
that closely yet.

By "Solaris compatibility" I'm referring to the size of ``sa_family_t''
and the corresponding changes in "struct pfr_table".


I see. sa_family_t is kind of surprise it's defined as uint16_t on Solaris.
PF at various places mixes sa_family_t with u_int8_t, so all af variables
on Solaris had to be turned to sa_family_t. Some of those changes leaked
backed during merge to current.

Even if on OpenBSD sa_family_t is defined as uint8_t, I'd argue that
for portability reasons we should use the correct type where it is
appropriate.  And here your "leak" is a good example of portability.

If you have some changes that could improve the portability of the
software and if you feel like sharing them, I'd be interested.

Regards,
Martin





Re: SMP steroids for PF

2015-06-26 Thread Martin Pieuchot
On 26/06/15(Fri) 17:19, Alexandr Nedvedicky wrote:
> On Fri, Jun 26, 2015 at 04:34:06PM +0200, Martin Pieuchot wrote:
> > On 26/06/15(Fri) 16:00, Alexandr Nedvedicky wrote:
> > > > 2)  I saw that you found some ALTQ leftovers, you have some Solaris
> > > (2) I think ALTQs leftovers are still in CVS repo, will double check
> > > anyway. Stack alignment is not Solaris compatibility hack it's sparc
> > > compatibility. May be your C compiler takes care of this and grants
> > > 16/32/64 bit stack alignment. I have not examined build process
> > > that closely yet.
> > 
> > By "Solaris compatibility" I'm referring to the size of ``sa_family_t''
> > and the corresponding changes in "struct pfr_table".
> > 
> I see. sa_family_t is kind of surprise it's defined as uint16_t on Solaris.
> PF at various places mixes sa_family_t with u_int8_t, so all af variables
> on Solaris had to be turned to sa_family_t. Some of those changes leaked
> backed during merge to current.

Even if on OpenBSD sa_family_t is defined as uint8_t, I'd argue that
for portability reasons we should use the correct type where it is
appropriate.  And here your "leak" is a good example of portability.

If you have some changes that could improve the portability of the
software and if you feel like sharing them, I'd be interested.

Regards,
Martin



Re: SMP steroids for PF

2015-06-26 Thread Alexandr Nedvedicky
On Fri, Jun 26, 2015 at 04:34:06PM +0200, Martin Pieuchot wrote:
> On 26/06/15(Fri) 16:00, Alexandr Nedvedicky wrote:
> > Hello Martin,
> > 
> > I accept or your comments. I just have few quick notes/questions now.
> > 
> > > 2)  I saw that you found some ALTQ leftovers, you have some Solaris
> > (2) I think ALTQs leftovers are still in CVS repo, will double check
> > anyway. Stack alignment is not Solaris compatibility hack it's sparc
> > compatibility. May be your C compiler takes care of this and grants
> > 16/32/64 bit stack alignment. I have not examined build process
> > that closely yet.
> 
> By "Solaris compatibility" I'm referring to the size of ``sa_family_t''
> and the corresponding changes in "struct pfr_table".
> 
I see. sa_family_t is kind of surprise it's defined as uint16_t on Solaris.
PF at various places mixes sa_family_t with u_int8_t, so all af variables
on Solaris had to be turned to sa_family_t. Some of those changes leaked
backed during merge to current.

> > (3)
> > > use atomic operations rather than per-CPU counters or any other
> > > solution?  I'm also raising this question because some counters are
> > can you point me to manual page or source code sample so I can have a look 
> > how
> > to use per-CPU counter?
> 
> There's no such manual.  I was more asking about the reason for using
> atomic operations.  Is it because you're trying to use existing APIs?
I'm taking what's available.

> > > 5)  I'm not sure to understand the goal of the new "pf_refcnt_t" type
> > (5) Solaris defines pf_refcnt_t as 64-bit unsigned integer, pf_refcnt_t 
> > hopes
> > to make porting easier. It can be defined as 32-bit on 32-bit machines.
> 
> Using a long on OpenBSD will grantee that the value fits in a register,
> so it should be fine.
O.K. thanks for clarification.

> 
> > > 7)  The PF_SMP_INSERT_WQ() macro to replace SLIST_INSERT() seems over-
> > PF_SMP_INSERT_WQ() purpose of those is to allow every CPU/thread to
> > operate on its own work-queue of ktables/kentries. The current pf
> > uses 'intrusive' link members pfrkt_workq/pfrke_workq in 
> > pfr_ktable/pfr_kentry.
> > The only idea is to stay as much close to current version as possible.
> 
> I understand that you want to stay close to the current version.  I'm
> just saying that we can also modify the current version to reduce the
> size of your diff.

I understand. I'll try to restructure the patches to make them easier
for review.
 
> 
> Regards,
> Martin



Re: SMP steroids for PF

2015-06-26 Thread Martin Pieuchot
On 26/06/15(Fri) 16:00, Alexandr Nedvedicky wrote:
> Hello Martin,
> 
> I accept or your comments. I just have few quick notes/questions now.
> 
> > 2)  I saw that you found some ALTQ leftovers, you have some Solaris
> (2) I think ALTQs leftovers are still in CVS repo, will double check
> anyway. Stack alignment is not Solaris compatibility hack it's sparc
> compatibility. May be your C compiler takes care of this and grants
> 16/32/64 bit stack alignment. I have not examined build process
> that closely yet.

By "Solaris compatibility" I'm referring to the size of ``sa_family_t''
and the corresponding changes in "struct pfr_table".

> (3)
> > use atomic operations rather than per-CPU counters or any other
> > solution?  I'm also raising this question because some counters are
> can you point me to manual page or source code sample so I can have a look how
> to use per-CPU counter?

There's no such manual.  I was more asking about the reason for using
atomic operations.  Is it because you're trying to use existing APIs?
Are the macros implemented differently in Solaris?

> > 5)  I'm not sure to understand the goal of the new "pf_refcnt_t" type
> (5) Solaris defines pf_refcnt_t as 64-bit unsigned integer, pf_refcnt_t hopes
> to make porting easier. It can be defined as 32-bit on 32-bit machines.

Using a long on OpenBSD will grantee that the value fits in a register,
so it should be fine.

> > 7)  The PF_SMP_INSERT_WQ() macro to replace SLIST_INSERT() seems over-
> PF_SMP_INSERT_WQ() purpose of those is to allow every CPU/thread to
> operate on its own work-queue of ktables/kentries. The current pf
> uses 'intrusive' link members pfrkt_workq/pfrke_workq in 
> pfr_ktable/pfr_kentry.
> The only idea is to stay as much close to current version as possible.

I understand that you want to stay close to the current version.  I'm
just saying that we can also modify the current version to reduce the
size of your diff.

Regards,
Martin



Re: SMP steroids for PF

2015-06-26 Thread Alexandr Nedvedicky
Hello Martin,

I accept or your comments. I just have few quick notes/questions now.

> 2)  I saw that you found some ALTQ leftovers, you have some Solaris
(2) I think ALTQs leftovers are still in CVS repo, will double check
anyway. Stack alignment is not Solaris compatibility hack it's sparc
compatibility. May be your C compiler takes care of this and grants
16/32/64 bit stack alignment. I have not examined build process
that closely yet.

(3)
> use atomic operations rather than per-CPU counters or any other
> solution?  I'm also raising this question because some counters are
can you point me to manual page or source code sample so I can have a look how
to use per-CPU counter?

> 5)  I'm not sure to understand the goal of the new "pf_refcnt_t" type
(5) Solaris defines pf_refcnt_t as 64-bit unsigned integer, pf_refcnt_t hopes
to make porting easier. It can be defined as 32-bit on 32-bit machines.

> 7)  The PF_SMP_INSERT_WQ() macro to replace SLIST_INSERT() seems over-
PF_SMP_INSERT_WQ() purpose of those is to allow every CPU/thread to
operate on its own work-queue of ktables/kentries. The current pf
uses 'intrusive' link members pfrkt_workq/pfrke_workq in pfr_ktable/pfr_kentry.
The only idea is to stay as much close to current version as possible.


I'll try to break the patch into smaller chunks of changes. And post them.

regards
sasha




On Fri, Jun 26, 2015 at 02:36:38PM +0200, Martin Pieuchot wrote:
> Hello Sasha,
> 
> Alexandr Nedvedicky [alexandr.nedvedi...@oracle.com] wrote:
> > Hello,
> > 
> > attached is SMP patch for PF. consider it as toxic proof of concept as it 
> > has
> > paniced my amd64 system (see attached phone-shot). I have to figure out how 
> > to
> > debug it yet. The problem is the USB keyboard has died, so I had no chance 
> > to
> > type anything.  fortunately the issue is 100% reproducible.
> > 
> > The patch compiles in .MP and non-MP version. As you'll see more work is
> > needed to stabilize it and get full SMP support of PF. Those PF
> > features are not covered by SMP changes:
> > - packet queues
> > - packet logging
> > - pf-sync
> 
> This is an impressive diff, wow!  I started to look at it and my first
> impression is that it is too big.  You should really try to split it
> in smaller pieces to get proper reviews.
> 
> Anyway here are some comments about splitting/cleaning this diff, I'll
> need more time to be really able to comment on your work.  I'd just
> want to say "wow" again.  This is amazing!
> 
> 1)  Your diff includes a lot of "cleanups" which are not directly
> related to your SMP work.  By cleanups I'm talking about the
> FALLTHROUGH, "#ifdef", comments, (void), etc that your added in
> various places.  I'd suggest submitting a first diff including
> all these cleanups.  It can be easily reviewed and committed and
> this will reduce the noise of your SMP work (and size of the diff).
> 
> 2)  I saw that you found some ALTQ leftovers, you have some Solaris
> compatibility goo, stack alignment tricks or when sometimes you
> need to return a variable to (de)reference it (ie pf_get_sport).
> These could also be single-shot easy-to-review diffs.
> 
> 3)  A lot of chunks in your diff are related to counter modifications.
> This could be a diff in itself.  I'm a bit afraid by the number of
> different macro to deal with counters.  Then why did you choose to
> use atomic operations rather than per-CPU counters or any other
> solution?  I'm also raising this question because some counters are
> 64bit long and there's no atomic operation to modify such value on
> 32bit architectures.
> 
> 4)  Regarding reference counting around pool-allocated object, I'd
> subject to wrap the pool_{get,put} into their own function this
> would greatly reduce the "#ifdef _PF_SMP_/#else" dances like:
> 
>   +#ifdef _PF_SMP_
>   +   pf_rule_smp_rele(rule);
>   +#else  /* !_PF_SMP_ */
>   pool_put(&pf_rule_pl, rule);
>   +#endif /* !_PF_SMP_ */
>   
> 
> 5)  I'm not sure to understand the goal of the new "pf_refcnt_t" type
> but using a long (why not unsigned?) makes sense with regards to
> atomic operations.  Note however that your comment describing it
> is incorrect.  I'd rather delete the comment.  It's a good
> explanation for an email but the intend is quite obvious.
> 
> 6)  In pf_osfp.c rather than changing the signature of some functions
> in the _PF_SMP_ case only, I'd suggest to adapt the existing code.
> Having fewer "#ifdef _PF_SMP_/#else" makes it easier to understand,
> review and work with the code 8)  This comment also applies to
> pfr_pool_get().
> 
> 7)  The PF_SMP_INSERT_WQ() macro to replace SLIST_INSERT() seems over-
> generic to me.  Do you plan to use it with a different allocator?
> Can't we use it for the SP version of pf_table too or at least 
> create a macro/function that behave differentl

Re: SMP steroids for PF

2015-06-26 Thread Martin Pieuchot
Hello Sasha,

Alexandr Nedvedicky [alexandr.nedvedi...@oracle.com] wrote:
> Hello,
> 
> attached is SMP patch for PF. consider it as toxic proof of concept as it has
> paniced my amd64 system (see attached phone-shot). I have to figure out how to
> debug it yet. The problem is the USB keyboard has died, so I had no chance to
> type anything.  fortunately the issue is 100% reproducible.
> 
> The patch compiles in .MP and non-MP version. As you'll see more work is
> needed to stabilize it and get full SMP support of PF. Those PF
> features are not covered by SMP changes:
>   - packet queues
>   - packet logging
>   - pf-sync

This is an impressive diff, wow!  I started to look at it and my first
impression is that it is too big.  You should really try to split it
in smaller pieces to get proper reviews.

Anyway here are some comments about splitting/cleaning this diff, I'll
need more time to be really able to comment on your work.  I'd just
want to say "wow" again.  This is amazing!

1)  Your diff includes a lot of "cleanups" which are not directly
related to your SMP work.  By cleanups I'm talking about the
FALLTHROUGH, "#ifdef", comments, (void), etc that your added in
various places.  I'd suggest submitting a first diff including
all these cleanups.  It can be easily reviewed and committed and
this will reduce the noise of your SMP work (and size of the diff).

2)  I saw that you found some ALTQ leftovers, you have some Solaris
compatibility goo, stack alignment tricks or when sometimes you
need to return a variable to (de)reference it (ie pf_get_sport).
These could also be single-shot easy-to-review diffs.

3)  A lot of chunks in your diff are related to counter modifications.
This could be a diff in itself.  I'm a bit afraid by the number of
different macro to deal with counters.  Then why did you choose to
use atomic operations rather than per-CPU counters or any other
solution?  I'm also raising this question because some counters are
64bit long and there's no atomic operation to modify such value on
32bit architectures.

4)  Regarding reference counting around pool-allocated object, I'd
subject to wrap the pool_{get,put} into their own function this
would greatly reduce the "#ifdef _PF_SMP_/#else" dances like:

+#ifdef _PF_SMP_
+   pf_rule_smp_rele(rule);
+#else  /* !_PF_SMP_ */
pool_put(&pf_rule_pl, rule);
+#endif /* !_PF_SMP_ */


5)  I'm not sure to understand the goal of the new "pf_refcnt_t" type
but using a long (why not unsigned?) makes sense with regards to
atomic operations.  Note however that your comment describing it
is incorrect.  I'd rather delete the comment.  It's a good
explanation for an email but the intend is quite obvious.

6)  In pf_osfp.c rather than changing the signature of some functions
in the _PF_SMP_ case only, I'd suggest to adapt the existing code.
Having fewer "#ifdef _PF_SMP_/#else" makes it easier to understand,
review and work with the code 8)  This comment also applies to
pfr_pool_get().

7)  The PF_SMP_INSERT_WQ() macro to replace SLIST_INSERT() seems over-
generic to me.  Do you plan to use it with a different allocator?
Can't we use it for the SP version of pf_table too or at least 
create a macro/function that behave differently for the SMP version
to reduce the "#ifdef" dances...

8)  Your protection of the pfi_ifhead RB-tree principally correspond to
the existing splsoftnet(), don't you think it would make sense to
use a single macro for the IPL and on SMP rwlock?

Cheers,
Martin



Re: SMP steroids for PF

2015-06-26 Thread Alexandr Nedvedicky
Sorry, fingers were faster than head (again...)

regards
sasha

On Thu, Jun 25, 2015 at 01:39:29PM -0700, Chris Cappuccio wrote:
> You should re-post as a diff -u !!
> 
> Alexandr Nedvedicky [alexandr.nedvedi...@oracle.com] wrote:
> > Hello,
> > 
> > attached is SMP patch for PF. consider it as toxic proof of concept as it 
> > has
> > paniced my amd64 system (see attached phone-shot). I have to figure out how 
> > to
> > debug it yet. The problem is the USB keyboard has died, so I had no chance 
> > to
> > type anything.  fortunately the issue is 100% reproducible.
> > 
> > The patch compiles in .MP and non-MP version. As you'll see more work is
> > needed to stabilize it and get full SMP support of PF. Those PF
> > features are not covered by SMP changes:
> > - packet queues
> > - packet logging
> > - pf-sync
> > 
> > patch is attached.
> > 
> > regards
> > sasha
> 
> 
> 
> -- 
> The bums will always lose


pf_smp.amd64.2015-06-26.diff.gz
Description: application/gunzip


SMP steroids for PF

2015-06-25 Thread Alexandr Nedvedicky
Hello,

attached is SMP patch for PF. consider it as toxic proof of concept as it has
paniced my amd64 system (see attached phone-shot). I have to figure out how to
debug it yet. The problem is the USB keyboard has died, so I had no chance to
type anything.  fortunately the issue is 100% reproducible.

The patch compiles in .MP and non-MP version. As you'll see more work is
needed to stabilize it and get full SMP support of PF. Those PF
features are not covered by SMP changes:
- packet queues
- packet logging
- pf-sync

patch is attached.

regards
sasha


pf_smp.amd64.2015-06-25.diff.gz
Description: application/gunzip