Re: FWSync driver about IPFW synchronization

2022-08-31 Thread Alexander V. Chernikov



> On 27 Aug 2022, at 15:48, Michael Pounov  wrote:
> 
> Hello all
> 
> I want to propose one new feature for IPFW. The FWSync driver exchanges dynamic 
> state and alias records between routers.
> If you are interested in having such a feature implemented in the FreeBSD code 
> base, feel free to take it.
> 
> I will also be glad to discuss the current FWSync driver.
> 
> There is a guide on how to install it and patch the OS code base to 
> connect them:
> http://www.elwix.org/site/documentation/fwsync-document/
A kernel-space syncer which is able to sync NAT states is a pretty good approach.
Mind putting it on Phabricator? The current format is not particularly easy to 
review/integrate.

> 
> Best Regards
> Michael Pounov  
> 
> 




Re: removing some error states

2018-05-03 Thread Alexander V . Chernikov
02.05.2018, 06:32, "Julian Elischer" :
> On 2/5/18 1:05 am, Julian Elischer wrote:
>>  On 1/5/18 11:03 pm, Rodney W. Grimes wrote:
>>>>  Many years ago I added code to ipfw so that if -q was set it would
>>>>  not complain about things that were unimportant, nor would it
>>>>  return an error code.
>>>>  Such things include removing table entries that are already gone and
>>>>  similar sorts of 'safe' operations.
>>>>  The idea is that you can write 'naive' scripts that don't need to do
>>>>  complicated checks to see if XXX is already present or gone..
>>>>  In the same way that rm -f doesn't complain if the file doesn't
>>>>  exist..? You were going to delete it anyhow.
>>>>
>>>>  I'd like that to continue to some of the new additions.
>>>>  for example the terribly annoying
>>>>    ipfw: DEPRECATED: inserting data into non-existent table 18.
>>>>  (auto-created) (who cares?)
>>>>
>>>>  and
>>>>
>>>>    ljcc-78# ipfw table 19 create
>>>>    ipfw: Table creation failed: File exists
>>>>
>>>>  As the script needs to run multiple times, I don't care if the table
>>>>  already exists.
>>>>  but I do care about other errors.
>>>>  I don't want to have to write special wrapper code for table create
>>>>  that is different from the wrappers elsewhere because it has to look
>>>>  for return code 71 and disregard it.
>>>>  Can we just have -q continue to ignore such errors please?
>>>  I think there is a bigger question here, why was auto table creation
>>>  with first insert "Deprecated" at all?   This to me just seems like
>>>  change because someone could change it, with no useful purpose, or
>>>  is there some great purpose this serves?
In the "old" world we had a single type of table, each of them named by a number 
in the 1..ip.fw.tables_max range. If the table number was less than 
ip.fw.tables_max, it automatically existed. If not, one had to increase 
ip.fw.tables_max.
In the new world, the number of different table types / configurations is pretty 
large. Additionally, one can use string names as table identifiers.
In most cases, the user wants the "old" kind of table with default values, but 
that might not always be the case. Suppose one wants a special kind of table 
but forgets to create it. Then silently creating a default table upon 
insertion will simply hide the error, leading to undesired, hard-to-diagnose 
behaviour at later stages.
 
>>>
>>>  Same with creation of an already existing file, why did that need
>>>  to become a noisy warning/error?
>>  Well there is an argument (that I disagree with in this case) that
>>  any unexpected event is an error..
>>
>>  Also the new tables can have many different key types and indexing
>>  algorithms, which need to be  declared up front.
>>
>>  but I don't see that raising a fatal error for trying to delete
>>  something that doesn't exist or make something that already exists
>>  really helps much other than to make scripts more complicated.
>>  That's why I made it optional before.. Removing table entries that
>>  are not present could be an error you want to know about, but
>>  probably it isn't.
The question here is what level of "smartness" ipfw(8) should implement. 
Traditionally it was a pretty low-level tool, used by other scripts/tools to 
manipulate the ruleset. In that case the generic idea is that we should 
propagate the error, so one can handle it properly in the upper layer.
>
> my biggest issue is that it bombs out when you are using it as a filter.
> e.g. (manual simulation)
I understand your frustration. The original changeset for the named tables was 
pretty big and I was mostly focused on fixing the kernel and providing backward 
compatibility, so some of the ipfw(8) stuff was not thought through. Let's 
discuss/agree on the desired behaviour and fix it.
>
> 32Ssd# ipfw -q -f /dev/tty    < -q   "don't complain and quit on
> unimportant things  -f  "trust me I know what I'm doing"
I always thought of "-q" as meaning mostly "Be quiet when executing ...", as 
the ipfw(8) manual states, which is different from a generic "ignore errors".
I totally agree that having some way of saying "ignore the error and continue" 
is a thing we should have. Adding another option for that would probably be 
overkill, so we can indeed rebrand "-q".
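
Until "-q" behaves that way, a script has to do the filtering itself. The following is only a minimal sketch of such a wrapper, not anything from ipfw(8): `run_tolerant` and `fake_create` are hypothetical names, and the stand-in command merely exits with code 71 (the code mentioned earlier in this thread) so the example runs without a firewall.

```python
import subprocess
import sys


def run_tolerant(cmd, ignorable=frozenset()):
    """Run cmd and treat the exit codes listed in `ignorable` as success.

    Returns the real exit code so callers can still log it.
    Raises CalledProcessError for any other non-zero exit code.
    """
    proc = subprocess.run(cmd, capture_output=True, text=True)
    if proc.returncode != 0 and proc.returncode not in ignorable:
        raise subprocess.CalledProcessError(
            proc.returncode, cmd, proc.stdout, proc.stderr
        )
    return proc.returncode


# Stand-in for `ipfw table 19 create` on a table that already exists.
# Assumption: the "already exists" case is reported as exit code 71.
fake_create = [sys.executable, "-c", "import sys; sys.exit(71)"]

if __name__ == "__main__":
    code = run_tolerant(fake_create, ignorable={71})
    print("exit code", code, "tolerated")
```

The point of keeping the set of ignorable codes explicit is that every other failure still propagates, which is the behaviour both sides of this discussion seem to want.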
> table 3 add 1.1.1.1
> table 3 add 1.1.1.1   <- no error.. this is what I want..
> table 20 add 2.2.2.2
> table 3 swap 20
> table all list
> --- table(3), set(0) ---
> 2.2.2.2/32 0
> --- table(20), set(0) ---
> 1.1.1.1/32 0
> table 3 swap 21  <--  doesn't quit, but doesn't generate a new
> empty 21 either :-(
What is the "proper" behaviour here from your point of view (with and without -q)?
> table all list
> --- table(3), set(0) ---
> 2.2.2.2/32 0
> --- table(20), set(0) ---
> 1.1.1.1/32 0
> table 21 create
> table 21 create  <-- this shouldn't quit..   actually I
This shouldn't quit IFF "-q" is supplied and the already-created table is 

Re: panic: refcount inconsistency: found: 0 total: 1

2015-11-03 Thread Alexander V . Chernikov
03.11.2015, 17:05, "David Wolfskill" :
> This was on my laptop; yesterday, it built & booted:
>
> FreeBSD g1-252.catwhisker.org 11.0-CURRENT FreeBSD 11.0-CURRENT #230 
> r290270M/290270:1100085: Mon Nov 2 05:03:07 PST 2015 
> r...@g1-252.catwhisker.org:/common/S4/obj/usr/src/sys/CANARY amd64
>
> OK; today, after building:
>
> FreeBSD localhost 11.0-CURRENT FreeBSD 11.0-CURRENT #231 
> r290334M/290334:1100086: Tue Nov 3 04:51:24 PST 2015 
> r...@g1-252.catwhisker.org:/common/S4/obj/usr/src/sys/CANARY amd64
>
> I tried booting it, and during the transition to multi-user mode,
> once ipfw was being invoked, I got the above-cited panic. Circumvention
> was to leave it disconnected from a network (turn off the WiFi
> switch, in my case), so we don't get a chance to use the network.
It is most probably related to r290334. Would you mind reverting it and 
checking whether ipfw works correctly?

>
> I was able to get a dump by explicitly typing "call doadump" -- an
> earlier attempt at "panic" didn't capture one. Stack trace:
>
> #0 doadump (textdump=0) at pcpu.h:221
> 221 pcpu.h: No such file or directory.
> in pcpu.h
> (kgdb) #0 doadump (textdump=0) at pcpu.h:221
> #1 0x8037b6b6 in db_fncall (dummy1=,
> dummy2=, dummy3=,
> dummy4=) at /usr/src/sys/ddb/db_command.c:568
> #2 0x8037b14e in db_command (cmd_table=0x0)
> at /usr/src/sys/ddb/db_command.c:440
> #3 0x8037aee4 in db_command_loop ()
> at /usr/src/sys/ddb/db_command.c:493
> #4 0x8037d97b in db_trap (type=, code=0)
> at /usr/src/sys/ddb/db_main.c:251
> #5 0x80a270f3 in kdb_trap (type=3, code=0, tf=)
> at /usr/src/sys/kern/subr_kdb.c:654
> #6 0x80db6668 in trap (frame=0xfe060bdde1d0)
> at /usr/src/sys/amd64/amd64/trap.c:549
> #7 0x80d961f7 in calltrap ()
> at /usr/src/sys/amd64/amd64/exception.S:234
> #8 0x80a267db in kdb_enter (why=0x812a5566 "panic",
> msg=0x80 ) at cpufunc.h:63
> #9 0x809ea01f in vpanic (fmt=,
> ap=) at /usr/src/sys/kern/kern_shutdown.c:750
> #10 0x809e9e76 in kassert_panic (fmt=)
> at /usr/src/sys/kern/kern_shutdown.c:647
> #11 0x80c2a788 in ipfw_rewrite_rule_uidx (chain=0x81be5310,
> ci=0xfe060bdde4b8) at /usr/src/sys/netpfil/ipfw/ip_fw_table.c:3395
> #12 0x80c267c3 in commit_rules (chain=0x81be5310,
> rci=0xfe060bdde4b8, count=1)
> at /usr/src/sys/netpfil/ipfw/ip_fw_sockopt.c:678
> #13 0x80c25d80 in add_rules (chain=0x81be5310,
> op3=, sd=)
> at /usr/src/sys/netpfil/ipfw/ip_fw_sockopt.c:2594
> #14 0x80c232f4 in ipfw_ctl3 (sopt=0xfe060bdde920)
> at /usr/src/sys/netpfil/ipfw/ip_fw_sockopt.c:3242
> #15 0x80b3d8b1 in rip_ctloutput (so=,
> sopt=0xfe060bdde920) at /usr/src/sys/netinet/raw_ip.c:588
> #16 0x80a72bc6 in sogetopt (so=0xf80009e658b8,
> sopt=0xfe060bdde920) at /usr/src/sys/kern/uipc_socket.c:2731
> #17 0x80a7729e in kern_getsockopt (td=0xf800098119a0,
> s=, level=,
> name=, val=, valseg=464,
> valsize=0xfe060bdde98c) at /usr/src/sys/kern/uipc_syscalls.c:1540
> #18 0x80a771a0 in sys_getsockopt (td=0xf800098119a0,
> uap=0xfe060bddea40) at /usr/src/sys/kern/uipc_syscalls.c:1486
> #19 0x80db7519 in amd64_syscall (td=0xf800098119a0, traced=0)
> at subr_syscall.c:140
> #20 0x80d964db in Xfast_syscall ()
> at /usr/src/sys/amd64/amd64/exception.S:394
> #21 0x000800b2cbea in ?? ()
> Previous frame inner to this frame (corrupt stack?)
> Current language: auto; currently minimal
> (kgdb)
>
> I've copied the vmcore.z & core.txt.7 to
> ; gzipped
> copies are also available:
>
> Index of /~david/FreeBSD/head/ipfw
>
>  Icon Name Last modified Size Description
>   _
>  [PARENTDIR] Parent Directory -
>  [TXT] core.txt.7 2015-11-03 05:22 155K
>  [ ] core.txt.7.gz 2015-11-03 05:22 35K
>  [ ] vmcore.7 2015-11-03 05:22 528M
>  [ ] vmcore.7.gz 2015-11-03 05:22 45M
>   _
>
> I'll start taking a closer look at recent changes (e.g., in
> src/sys/netpfil/ipfw), but I'm not really all that familiar with
> the code.
>
> Peace,
> david
> --
> David H. Wolfskill da...@catwhisker.org
> Those who would murder in the name of God or prophet are blasphemous cowards.
>
> See http://www.catwhisker.org/~david/publickey.gpg for my public key.
___
freebsd-ipfw@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-ipfw
To unsubscribe, send any mail to "freebsd-ipfw-unsubscr...@freebsd.org"

Re: ipfw delete 100-300

2015-08-13 Thread Alexander V . Chernikov


13.08.2015, 18:19, Julian Elischer jul...@freebsd.org:
 On 8/13/15 10:41 PM, Ian Smith wrote:
  On Thu, 13 Aug 2015 16:30:15 +0200, Luigi Rizzo wrote:
     On Thu, Aug 13, 2015 at 4:00 PM, Ian Smith smi...@nimnet.asn.au wrote:
      On Thu, 13 Aug 2015 12:24:31 +0800, Julian Elischer wrote:
       BTW, any ideas as to what causes this?
       # ipfw show
       [...]
       00400 0 0 deny ip from 10.12.1.0/24 to any in recv xn0
       00500 0 16045693110842147038 deny ip from 204.109.63.0/25 to any in recv xn1
       00600 0 0 allow ip from any to any in recv xn1
       [...]
       65535 8251 16045693110842147290 deny ip from any to any
      
      
       -current as of the 5th of august
       FreeBSD vps1.elischer.org 11.0-CURRENT FreeBSD 11.0-CURRENT #1 r286304: Wed Aug 5 14:31:10 PDT 2015
       r...@vps1.elischer.org:/usr/obj/usr/src-current/sys/VPS1 i386
      
       note i386, not amd64.
     
      Assuming all digits were shown, on a wild hunch:
     
      t23% echo 'scale=20; 2^64 - 16045693110842147038' | bc
      2401050962867404578
      t23% echo 'scale=20; 2^63 - 16045693110842147038' | bc
      -6822321073987371230
     
    
     bc
     obase=16
     16045693110842147038
     DEADC0DEDEADC0DE
    
     so... somehow pointing in a bad place.

  Ah, quite so .. and rule 65535 looks like a slightly worse place.

  t23% echo 'obase=16; 16045693110842147290' | bc
  DEADC0DEDEADC1DA

 that's deadcode when it's had some packets added to it :-)

 I think our friend Mr Chernikov may have tripped up over something..
Well, I'll take a look at it when I set up an i386 VM :)
Not easy to find one these days..
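
The bc conversions quoted above can be repeated in a few lines. This is just a sketch of the arithmetic: `describe_counter` is a hypothetical helper, and the two inputs are the byte counters from the `ipfw show` output in this thread.

```python
POISON = 0xDEADC0DE  # the 32-bit pattern spotted in this thread


def describe_counter(value):
    """Split a 64-bit counter into 32-bit halves and compare with the pattern."""
    high, low = value >> 32, value & 0xFFFFFFFF
    return {
        "hex": format(value, "016X"),
        "high_is_poison": high == POISON,
        # How far the low half has moved past the pattern.
        "low_offset": low - POISON,
    }


rule_500 = describe_counter(16045693110842147038)
rule_65535 = describe_counter(16045693110842147290)

if __name__ == "__main__":
    print(rule_500)
    print(rule_65535)
```

For rule 65535 the low half sits 252 above the pattern, which fits the reading that a counter was incremented on top of memory that already held 0xDEADC0DE.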

  thanks, Ian

Re: ipfw delete 100-300

2015-08-13 Thread Alexander V . Chernikov
13.08.2015, 18:21, Luigi Rizzo ri...@iet.unipi.it:
 On Thu, Aug 13, 2015 at 5:18 PM, Julian Elischer jul...@freebsd.org wrote:
  On 8/13/15 10:41 PM, Ian Smith wrote:
  On Thu, 13 Aug 2015 16:30:15 +0200, Luigi Rizzo wrote:
     On Thu, Aug 13, 2015 at 4:00 PM, Ian Smith smi...@nimnet.asn.au
  wrote:
      On Thu, 13 Aug 2015 12:24:31 +0800, Julian Elischer wrote:
       BTW, any ideas as to what causes this?
       # ipfw show
       [...]
       00400 0 0 deny ip from 10.12.1.0/24 to any in recv xn0
       00500 0 16045693110842147038 deny ip from 204.109.63.0/25 to any in recv xn1
       00600 0 0 allow ip from any to any in recv xn1
       [...]
       65535 8251 16045693110842147290 deny ip from any to any
      
      
       -current as of the 5th of august
       FreeBSD vps1.elischer.org 11.0-CURRENT FreeBSD 11.0-CURRENT #1 r286304: Wed Aug 5 14:31:10 PDT 2015
       r...@vps1.elischer.org:/usr/obj/usr/src-current/sys/VPS1 i386
      
       note i386, not amd64.
     
      Assuming all digits were shown, on a wild hunch:
     
      t23% echo 'scale=20; 2^64 - 16045693110842147038' | bc
      2401050962867404578
      t23% echo 'scale=20; 2^63 - 16045693110842147038' | bc
      -6822321073987371230
     
    
     bc
     obase=16
     16045693110842147038
     DEADC0DEDEADC0DE
    
     so... somehow pointing in a bad place.

  Ah, quite so .. and rule 65535 looks like a slightly worse place.

  t23% echo 'obase=16; 16045693110842147290' | bc
  DEADC0DEDEADC1DA

  that's deadcode when it's had some packets added to it :-)

  I think our friend Mr Chernikov may have tripped up over something..

 looks more like the counter API. The old counters were inline in the rules.
In that case we would probably have garbage in the pkts counter, too.
Anyway, I'm setting up the VM to see if this is a kernel or userland problem..

 cheers
 luigi

  thanks, Ian


 --
 -+---
  Prof. Luigi RIZZO, ri...@iet.unipi.it . Dip. di Ing. dell'Informazione
  http://www.iet.unipi.it/~luigi/ . Universita` di Pisa
  TEL +39-050-2217533 . via Diotisalvi 2
  Mobile +39-338-6809875 . 56122 PISA (Italy)
 -+---

Re: ipfw delete 100-300

2015-08-04 Thread Alexander V . Chernikov
03.08.2015, 22:57, Julian Elischer jul...@freebsd.org:
 On 8/3/15 10:50 PM, Alexander V. Chernikov wrote:
  03.08.2015, 17:14, Ian Smith smi...@nimnet.asn.au:
  On Mon, 3 Aug 2015 17:38:18 +0800, Julian Elischer wrote:
     my reading of the code I can see that 'ipfw delete 100-300' doesn't
     work (well I know it doesn't work, but I had thought it was a bug),
     Now I see that its just 'not supported'
  I implemented the kernel range deletion, but converted userland part as-is.
  Should work on HEAD now (r286232).

 great!
 Pity I'm stuck working on 8.0 :-) maybe I can back-port it.
Well, I'm afraid you will have to do it in a slightly different way, since there 
was no ranged delete support in the kernel (so you would probably need to fetch 
the ruleset and delete matching rules one by one).
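
A rough sketch of that userland fallback, under stated assumptions: the ruleset text is taken to look like `ipfw list` output with the rule number as the first field, `delete_commands` is a hypothetical helper, and the sample ruleset is made up for illustration. It only builds the `ipfw delete N` commands and does not run them.

```python
def delete_commands(ruleset_text, first, last):
    """Return one `ipfw delete N` command per distinct rule number in [first, last]."""
    numbers = []
    for line in ruleset_text.splitlines():
        fields = line.split()
        if not fields or not fields[0].isdigit():
            continue  # skip blank lines and anything that is not a rule
        number = int(fields[0])
        # Several rules may share one number; a single delete covers them all.
        if first <= number <= last and number not in numbers:
            numbers.append(number)
    return [["ipfw", "delete", str(n)] for n in numbers]


# Illustrative ruleset, not taken from any real host.
sample = """\
00100 allow ip from any to any via lo0
00200 deny ip from any to 127.0.0.0/8
00200 deny ip from 127.0.0.0/8 to any
00300 allow tcp from any to any established
00400 deny ip from 10.12.1.0/24 to any in recv xn0
65535 deny ip from any to any
"""

cmds = delete_commands(sample, 100, 300)

if __name__ == "__main__":
    for cmd in cmds:
        print(" ".join(cmd))
```

Unlike the kernel-side range deletion, this is not atomic: the ruleset can change between the fetch and the individual deletes.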

    
     It may be my imagination but (distant) past?

  I was surprised too; ISTR having used that before too, but I may
  misremember remembering ..
  I also had a feeling that this syntax should work (maybe because it 
 silently accepted ranged queries) but I couldn't find any presence of real 
 ranged deletion support in SVN.
  On 9.3 with rules 100-1000 in 100's, 'ipfw delete 600-800' deletes only
  600 .. without complaint, returning 0 if 600 existed. NG for scripts.

  cheers, Ian

Re: ipfw delete 100-300

2015-08-03 Thread Alexander V . Chernikov
03.08.2015, 17:14, Ian Smith smi...@nimnet.asn.au:
 On Mon, 3 Aug 2015 17:38:18 +0800, Julian Elischer wrote:
   my reading of the code I can see that 'ipfw delete 100-300' doesn't
   work (well I know it doesn't work, but I had thought it was a bug),
   Now I see that its just 'not supported'
I implemented the kernel range deletion, but converted userland part as-is.
Should work on HEAD now (r286232). 

  
   It may be my imagination but (distant) past?

 I was surprised too; ISTR having used that before too, but I may
 misremember remembering ..
I also had a feeling that this syntax should work (maybe because it silently 
accepted ranged queries) but I couldn't find any presence of real ranged 
deletion support in SVN.

 On 9.3 with rules 100-1000 in 100's, 'ipfw delete 600-800' deletes only
 600 .. without complaint, returning 0 if 600 existed. NG for scripts.

 cheers, Ian

Re: ipfw on just inbound and not outbound

2015-05-24 Thread Alexander V . Chernikov
23.05.2015, 03:58, hiren panchasara hi...@strugglingcoder.info:
 On 05/21/15 at 02:05P, hiren panchasara wrote:
  On 05/21/15 at 12:42P, hiren panchasara wrote:
  Getting back to this now to see if I can avoid ipfw on outgoing packets.

  @@ -500,7 +507,7 @@ ipfw_hook(int onoff, int pf)
  hook_func = (pf == AF_LINK) ? ipfw_check_frame : ipfw_check_packet;

  (void) (onoff ? pfil_add_hook : pfil_remove_hook)
  -   (hook_func, NULL, PFIL_IN | PFIL_OUT | PFIL_WAITOK, pfh);
  +   (hook_func, NULL, PFIL_IN | PFIL_WAITOK, pfh);

  return 0;
  }

  Should this do the right thing? I'll report back once I test this patch.
  I am still seeing ipfw_chk() getting called in my iperf test. Now, if I
  also remove PFIL_IN, i.e if I do:
  -   (hook_func, NULL, PFIL_IN | PFIL_OUT | PFIL_WAITOK, pfh);
  +   (hook_func, NULL, PFIL_WAITOK, pfh);

  I don't see ipfw_chk() getting triggered.

  Somehow incoming traffic is affecting the outgoing traffic?

 It seems I screwed up something in testing and the following does seem to do 
 the
 right thing:

 -   (hook_func, NULL, PFIL_IN | PFIL_OUT | PFIL_WAITOK, pfh);
 +   (hook_func, NULL, PFIL_IN | PFIL_WAITOK, pfh);

 I confirmed this with pmcstat callgraphs that ipfw_chk() is not getting
 called in OUT direction.

 Any thoughts on this? Is this something that can be upstreamed with a
 sysctl knob if there is interest?
Unfortunately, I've missed most of the thread.
First of all, just calling the ipfw hook should not be so costly. I can believe 
5% on 8/9/10, but the reason should be not ipfw itself but the rwlock which is 
used to protect the ruleset.
The HEAD version should behave better (we observe something like 1% overhead at 
a 10+ Mpps rate) since it uses rmlock.

Being able to attach/detach the appropriate L3 hooks is really a good idea; 
however, the better (but not easier) way would be to implement something like a
`pfilctl' utility which would be able to control/disable/reorder hooks on 
particular filtering points (pf before ipfw or vice versa). This would also 
help to convert some of the current L2 processing code to pfil (like the 
BPF, lagg and netgraph input hooks in ether_input).
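
To make the idea concrete, here is a toy model of a filtering point with per-direction hooks and reordering. Everything in it is hypothetical: `FilterPoint` is not a kernel structure, no `pfilctl` interface is implied, and the flag values are illustrative stand-ins for the constants in sys/net/pfil.h.

```python
# Illustrative values only; the real constants are defined in sys/net/pfil.h.
PFIL_IN = 0x1
PFIL_OUT = 0x2


class FilterPoint:
    """Toy model of one filtering point holding an ordered list of hooks."""

    def __init__(self):
        self.hooks = []  # list of (name, direction_flags)

    def add_hook(self, name, flags):
        self.hooks.append((name, flags))

    def reorder(self, names):
        """Reorder hooks to follow `names`, as a pfilctl-like tool might."""
        by_name = dict(self.hooks)
        self.hooks = [(n, by_name[n]) for n in names]

    def run(self, direction):
        """Return the hook names that would be called for `direction`."""
        return [name for name, flags in self.hooks if flags & direction]


inet = FilterPoint()
inet.add_hook("ipfw", PFIL_IN)            # inbound only, like the patch above
inet.add_hook("pf", PFIL_IN | PFIL_OUT)

if __name__ == "__main__":
    print("in: ", inet.run(PFIL_IN))
    print("out:", inet.run(PFIL_OUT))
    inet.reorder(["pf", "ipfw"])
    print("in after reorder:", inet.run(PFIL_IN))
```

With ipfw registered for PFIL_IN only, outbound packets never reach it, which matches what the pmcstat callgraphs in this thread showed.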


 cheers,
 Hiren

Re: The KASSERT from r282155 fired; have crash dump. will travel

2015-04-30 Thread Alexander V . Chernikov
30.04.2015, 15:31, David Wolfskill da...@catwhisker.org:
 From /var/crash/core.txt.6:

 Thu Apr 30 05:21:22 PDT 2015

 FreeBSD  11.0-CURRENT FreeBSD 11.0-CURRENT #47  r282269M/282269:1100071: Thu 
 Apr 30 05:07:08 PDT 2015 
 r...@g1-254.catwhisker.org:/common/S3/obj/usr/src/sys/CANARY  amd64

 panic: refcount incosistency: found: 0 unr: 0 total: 1
Could you share your ruleset?
(And this panic should happen on one particular rule, could you check this?)

Thank you.

 ...
 Unread portion of the kernel message buffer:
 ...
 Reading symbols from /boot/kernel/if_lagg.ko.symbols...done.
 Loaded symbols for /boot/kernel/if_lagg.ko.symbols
 #0  doadump (textdump=Unhandled dwarf expression opcode 0x93
 ) at pcpu.h:219
 219 pcpu.h: No such file or directory.
 in pcpu.h
 (kgdb) #0  doadump (textdump=Unhandled dwarf expression opcode 0x93
 ) at pcpu.h:219
 #1  0x80355f9e in db_dump (dummy=value optimized out, 
 dummy2=Unhandled dwarf expression opcode 0x93
 )
 at /usr/src/sys/ddb/db_command.c:533
 #2  0x80355b17 in db_command (cmd_table=0x0)
 at /usr/src/sys/ddb/db_command.c:440
 #3  0x80355794 in db_command_loop ()
 at /usr/src/sys/ddb/db_command.c:493
 #4  0x80358350 in db_trap (type=value optimized out, code=Unhandled 
 dwarf expression opcode 0x93
 )
 at /usr/src/sys/ddb/db_main.c:251
 #5  0x8096a1b4 in kdb_trap (type=Unhandled dwarf expression opcode 
 0x93
 ) at /usr/src/sys/kern/subr_kdb.c:654
 #6  0x80cd61fe in trap (frame=0xfe060cc04220)
 at /usr/src/sys/amd64/amd64/trap.c:540
 #7  0x80cb6702 in calltrap ()
 at /usr/src/sys/amd64/amd64/exception.S:235
 #8  0x8096988e in kdb_enter (why=0x80f5f68a panic,
 msg=0x80974750 
 UH\211åAWAVATSH\203ìPI\211÷A\211þH\213\004%Ðvd\201H\211EØ\201%x\206d\201) 
 at cpufunc.h:63
 #9  0x8092d949 in vpanic (fmt=value optimized out,
 ap=value optimized out) at /usr/src/sys/kern/kern_shutdown.c:739
 #10 0x8092d792 in kassert_panic (fmt=value optimized out)
 at /usr/src/sys/kern/kern_shutdown.c:634
 #11 0x80b56c11 in ipfw_rewrite_rule_uidx (chain=0x817b2be0,
 ci=0xfe060cc04508) at /usr/src/sys/netpfil/ipfw/ip_fw_table.c:3402
 #12 0x80b51613 in commit_rules (chain=0x817b2be0,
 rci=0xfe060cc04508, count=Unhandled dwarf expression opcode 0x93
 )
 at /usr/src/sys/netpfil/ipfw/ip_fw_sockopt.c:675
 #13 0x80b533ed in add_rules (chain=0x817b2be0,
 op3=value optimized out, sd=value optimized out)
 at /usr/src/sys/netpfil/ipfw/ip_fw_sockopt.c:2589
 #14 0x80b4f9bb in ipfw_ctl3 (sopt=0xfe060cc04920)
 at /usr/src/sys/netpfil/ipfw/ip_fw_sockopt.c:3189
 #15 0x809be84e in kern_getsockopt (td=0xf8000a48b000,
 s=value optimized out, level=value optimized out,
 name=value optimized out, val=value optimized out, valseg=Unhandled 
 dwarf expression opcode 0x93
 )
 at /usr/src/sys/kern/uipc_syscalls.c:1531
 #16 0x809be750 in sys_getsockopt (td=0xf8000a48b000,
 uap=0xfe060cc04a40) at /usr/src/sys/kern/uipc_syscalls.c:1477
 #17 0x80cd714c in amd64_syscall (td=0xf8000a48b000, traced=0)
 at subr_syscall.c:133
 #18 0x80cb69eb in Xfast_syscall ()
 at /usr/src/sys/amd64/amd64/exception.S:395
 #19 0x000800b1caaa in ?? ()
 Previous frame inner to this frame (corrupt stack?)
 Current language:  auto; currently minimal
 (kgdb)
 

 I haven't tried i386 yet.  I can make the core.txt.6 file available.

 This was on my laptop, where I use ipfw in a failover LAGG environment.

 Anything else I can provide to help fix this?

 Peace,
 david
 --
 David H. Wolfskill da...@catwhisker.org
 Those who murder in the name of God or prophet are blasphemous cowards.

 See http://www.catwhisker.org/~david/publickey.gpg for my public key.

Re: panic: resize_storage() notify failure [Was: HEADS UP: Merging projects/ipfw to HEAD]

2014-10-11 Thread Alexander V. Chernikov

On 11.10.2014 18:15, David Wolfskill wrote:

On Sat, Oct 04, 2014 at 04:35:51PM +0400, Alexander V. Chernikov wrote:

Hi,

I'm going to merge projects/ipfw branch to HEAD in the middle of next week.


OK; I was able to build & install head @r272938 this morning on my
laptop; on reboot, I was greeted by a panic.

Oops. Not the best greeting, definitely.

Can you send me your ipfw ruleset?


Now, this is a laptop, so I don't have a serial console -- but I was
able to call doadump, then reboot with the wireless NIC disabled (to

Do you have some hooks to run ipfw on iface-up?

avoid the panic) and get the dump & core.txt captured.

Here's the first chunk of the core.txt file:

localhost dumped core - see /var/crash/vmcore.0

Sat Oct 11 07:02:26 PDT 2014

FreeBSD localhost 11.0-CURRENT FreeBSD 11.0-CURRENT #1392  
r272938M/272938:1100037: Sat Oct 11 05:44:30 PDT 2014 
r...@g1-235.catwhisker.org:/common/S4/obj/usr/src/sys/CANARY  i386

panic: resize_storage() notify failure

GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type show copying to see the conditions.
There is absolutely no warranty for GDB.  Type show warranty for details.
This GDB was configured as i386-marcel-freebsd...

Unread portion of the kernel message buffer:
panic: resize_storage() notify failure
cpuid = 0
KDB: stack backtrace:
db_trace_self_wrapper(c10ebfd8,d1070720,fc,1000,1,...) at 0xc0528cdd = 
db_trace_self_wrapper+0x2d/frame 0xfa0cc508
kdb_backtrace(c12a9e27,0,c111af52,fa0cc5dc,fa0cc598,...) at 0xc0b22180 = 
kdb_backtrace+0x30/frame 0xfa0cc570
vpanic(c1447c52,100,c111af52,fa0cc5dc,fa0cc5dc,...) at 0xc0ae7b8d = 
vpanic+0x11d/frame 0xfa0cc5ac
kassert_panic(c111af52,fa0cc6f8,223,1e8,c0b71417,...) at 0xc0ae7a6a = 
kassert_panic+0xea/frame 0xfa0cc5d0
ipfw_link_table_values(c1518498,fa0cc6f8,25a,fa0cc728,c1469c5c,...) at 
0xc0d25cfd = ipfw_link_table_values+0x5ed/frame 0xfa0cc6a0
add_table_entry(c1518498,fa0cc7f0,fa0cc800,0,1,...) at 0xc0d1be78 = 
add_table_entry+0x348/frame 0xfa0cc7c8
manage_table_ent_v1(c1518498,fa0cca08,fa0cc870,8,c0d17710,...) at 0xc0d202b9 = 
manage_table_ent_v1+0x1c9/frame 0xfa0cc828
ipfw_ctl3(fa0ccbe0,2,fa0ccba8,c0a9ffc4,fa0ccbd0,...) at 0xc0d1834d = 
ipfw_ctl3+0xacd/frame 0xfa0ccb20
rip_ctloutput(d2432dc0,fa0ccbe0,,27f,1f,...) at 0xc0c3cf49 = 
rip_ctloutput+0x299/frame 0xfa0ccb48
sogetopt(d2432dc0,fa0ccbe0,fa0ccbd0,0,fa0ccbf8,...) at 0xc0b6c670 = 
sogetopt+0xb0/frame 0xfa0ccba8
kern_getsockopt(d03afc40,4,0,30,bfbfd850,...) at 0xc0b71556 = 
kern_getsockopt+0x116/frame 0xfa0ccc0c
sys_getsockopt(d03afc40,fa08,c12ab55e,d5,c1455210,...) at 0xc0b71417 = 
sys_getsockopt+0x67/frame 0xfa0ccc40
syscall(fa0ccd08) at 0xc0f7c76b = syscall+0x31b/frame 0xfa0cccfc
Xint0x80_syscall() at 0xc0f665b1 = Xint0x80_syscall+0x21/frame 0xfa0cccfc
--- syscall (118, FreeBSD ELF32, sys_getsockopt), eip = 0x2815a3c7, esp = 
0xbfbfd2e4, ebp = 0xbfbfd300 ---
KDB: enter: panic

Reading symbols from /boot/kernel/linux.ko.symbols...done.
Loaded symbols for /boot/kernel/linux.ko.symbols
Reading symbols from /boot/kernel/coretemp.ko.symbols...done.
Loaded symbols for /boot/kernel/coretemp.ko.symbols
Reading symbols from /boot/kernel/iwn5000fw.ko.symbols...done.
Loaded symbols for /boot/kernel/iwn5000fw.ko.symbols
Reading symbols from /boot/modules/nvidia.ko...done.
Loaded symbols for /boot/modules/nvidia.ko
Reading symbols from /boot/kernel/tmpfs.ko.symbols...done.
Loaded symbols for /boot/kernel/tmpfs.ko.symbols
Reading symbols from /boot/kernel/fdescfs.ko.symbols...done.
Loaded symbols for /boot/kernel/fdescfs.ko.symbols
Reading symbols from /boot/kernel/linprocfs.ko.symbols...done.
Loaded symbols for /boot/kernel/linprocfs.ko.symbols
#0  doadump (textdump=0) at pcpu.h:233
233 pcpu.h: No such file or directory.
in pcpu.h
(kgdb) #0  doadump (textdump=0) at pcpu.h:233
#1  0xc0526acd in db_fncall (dummy1=-99826980, dummy2=0, dummy3=1573888,
 dummy4=0xfa0cc2b4 \036\211\220À¸\026MÁ)
 at /usr/src/sys/ddb/db_command.c:578
#2  0xc05267ab in db_command (cmd_table=value optimized out)
 at /usr/src/sys/ddb/db_command.c:449
#3  0xc05264f0 in db_command_loop () at /usr/src/sys/ddb/db_command.c:502
#4  0xc0528e20 in db_trap (type=value optimized out,
 code=value optimized out) at /usr/src/sys/ddb/db_main.c:251
#5  0xc0b226f4 in kdb_trap (type=value optimized out,
 code=value optimized out, tf=value optimized out)
 at /usr/src/sys/kern/subr_kdb.c:654
#6  0xc0f7ba87 in trap (frame=value optimized out)
 at /usr/src/sys/i386/i386/trap.c:693
#7  0xc0f6651c in calltrap () at /usr/src/sys/i386/i386/exception.s:169
#8  0xc0b21f7d in kdb_enter (why=0xc10e77dd panic,
 msg=value optimized out) at cpufunc.h:71
#9  0xc0ae7bb1 in vpanic (fmt=value optimized out, ap=value optimized out)
 at /usr/src/sys/kern/kern_shutdown.c:739
#10

Re: HEADS UP: Merging projects/ipfw to HEAD

2014-10-09 Thread Alexander V . Chernikov
On 04 Oct 2014, at 16:35, Alexander V. Chernikov melif...@freebsd.org wrote:

 Hi,
 
 I'm going to merge projects/ipfw branch to HEAD in the middle of next week.
Merged in r272840.
 
 What has changed:
 
 Main user-visible changes are related to tables:
 
 * Tables are now identified by names, not numbers. There can be up to 65k 
 tables with up to 63-byte long names.
 * Tables are now set-aware (default off), so you can switch/move them 
 atomically with rules.
 * More functionality is supported (swap, lock, limits, user-level lookup, 
 batched add/del) by generic table code.
 * New table types are added (flow) so you can match multiple packet fields at 
 once.
 * Ability to add different types of lookup algorithms for a particular table 
 type has been added.
 * New table algorithms are added (cidr:hash, iface:array, number:array and 
 flow:hash) to make certain types of lookup more effective.
 * Table values are now capable of holding multiple data fields for different 
 tablearg users
 
 Some examples (see ipfw(8) manual page for the description):
 
   0:02 [2] zfscurr0# ipfw table fl2 create type flow:src-ip,proto,dst-port algo flow:hash valtype skipto,fib
   0:02 [2] zfscurr0# ipfw table fl2 info
   +++ table(fl2), set(0) +++
    kindex: 0, type: flow:src-ip,proto,dst-port
    valtype: number, references: 0
    algorithm: flow:hash
    items: 0, size: 280
   0:02 [2] zfscurr0# ipfw table fl2 add 2a02:6b8::333,tcp,443 45000,12
   0:02 [2] zfscurr0# ipfw table fl2 add 10.0.0.92,tcp,80 22000,13
   0:02 [2] zfscurr0# ipfw table fl2 list
   +++ table(fl2), set(0) +++
   2a02:6b8::333,6,443 45000
   10.0.0.92,6,80 22000
   0:02 [2] zfscurr0# ipfw add 200 count tcp from me to 78.46.89.105 80 flow 'table(fl2)'
 
   ipfw table mi_test create type cidr algo cidr:hash masks=/30,/64
   ipfw table mi_test add 10.0.0.8/30
   ipfw table mi_test add 2a02:6b8:b010::1/64 25
 
   # ipfw table si add 1.1.1.1/32  2.2.2.2/32 
   added: 1.1.1.1/32 
   added: 2.2.2.2/32 
   # ipfw table si add 2.2.2.2/32 2200 4.4.4.4/32 
   exists: 2.2.2.2/32 2200
   added: 4.4.4.4/32 
   ipfw: Adding record failed: record already exists
   ^ Returns error but keeps inserted items
   # ipfw table si list
   +++ table(si), set(0) +++
   1.1.1.1/32 
   2.2.2.2/32 
   4.4.4.4/32 
   # ipfw table si atomic add 3.3.3.3/32  4.4.4.4/32 4400 5.5.5.5/32 
   added(reverted): 3.3.3.3/32 
   exists: 4.4.4.4/32 4400
   ignored: 5.5.5.5/32 
   ipfw: Adding record failed: record already exists
   ^ Returns error and reverts added records
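
 The difference between the plain and `atomic` batched add in the transcript above can be modelled with a small sketch. This is an illustration of the observable behaviour only, not the kernel implementation; `batch_add` and the dict-based table are hypothetical.

```python
def batch_add(table, entries, atomic=False):
    """Add (key, value) pairs to `table`; return (ok, report).

    Non-atomic: entries added around a conflict stay in the table.
    Atomic: the first conflict stops processing and earlier additions
    from this batch are reverted.
    """
    report, added, ok = [], [], True
    for i, (key, value) in enumerate(entries):
        if key in table:
            report.append(("exists", key))
            ok = False
            if atomic:
                # Everything after the conflict is left untouched.
                report.extend(("ignored", k) for k, _ in entries[i + 1:])
                break
            continue
        table[key] = value
        added.append(key)
        report.append(("added", key))
    if atomic and not ok:
        for key in added:
            del table[key]
        report = [("added(reverted)", k) if s == "added" else (s, k)
                  for s, k in report]
    return ok, report


si = {}
first = batch_add(si, [("1.1.1.1/32", None), ("2.2.2.2/32", None)])
second = batch_add(si, [("2.2.2.2/32", 2200), ("4.4.4.4/32", None)])
third = batch_add(
    si,
    [("3.3.3.3/32", None), ("4.4.4.4/32", 4400), ("5.5.5.5/32", None)],
    atomic=True,
)

if __name__ == "__main__":
    for ok, report in (first, second, third):
        print(ok, report)
    print(sorted(si))
```

 Replaying the three commands from the transcript leaves the table with 1.1.1.1/32, 2.2.2.2/32 and 4.4.4.4/32, the same contents that `ipfw table si list` shows.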
 
 Performance changes:
 * Main ipfw lock was converted to rmlock
 * Rule counters were separated from the rule itself and made per-CPU.
 * Radix table entries fit into 128 bytes
 * struct ip_fw is now more compact, so more rules will fit into 64 bytes
 * Interface tables use an array of existing ifindexes for faster matching
 
 ABI changes:
 All functionality supported by old ipfw(8) remains functional. Old & new 
 binaries can work together with the following restrictions:
 * Tables named other than ^\d+$ are shown as table(65535) in ruleset in old 
 binaries
 * I'm a bit unsure about lookup src-port|dst-port N case, something may be 
 broken here. Anyway, this can be fixed for MFC
 
 Internal changes:
 Changing table ids to numbers resulted in format modification for most
 sockopt codes.
 Old sopt format was compact, but very hard to extend (no versioning,
 inability to add more opcodes), so
 * All relevant opcodes were converted to TLV-based versioned IP_FW3-based
 codes.
 * The remaining opcodes were also converted, to be able to eliminate all older
 opcodes at once
 * All IP_FW3 handlers use a special API instead of calling sooptcopy* directly,
 to ease adding other communication methods
 * struct ip_fw is now different for kernel and userland
 * tablearg value has been changed to 0 to ease future extensions
 * table values are now indexes in a special value array which holds extended
 data for the given index
 * Batched add/delete has been added to tables code
 * Most changes have been done to permit batched rule addition.
 * Interface tracking API has been added (started on demand) to permit
 efficient interface table operations
 * O(1) skipto cache, currently turned off by default at compile-time (eats
 512K).
 
 * Several steps have been made towards making libipfw:
  * most of the new functions were separated into parse/prepare/show and
 actually-do-stuff pieces (already merged).
  * there are separate functions for parsing a text string into struct ip_fw
 and printing struct ip_fw to a supplied buffer (already merged).
 * Probably some more less significant/forgotten features
 
 ___
 freebsd-...@freebsd.org mailing list
 http://lists.freebsd.org/mailman/listinfo/freebsd-net
 To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org
 


Re: HEADS UP: Merging projects/ipfw to HEAD

2014-10-05 Thread Alexander V. Chernikov

On 05.10.2014 14:13, Willem Jan Withagen wrote:

On 5-10-2014 4:18, John W. O'Brien wrote:

On 10/4/14 8:35 AM, Alexander V. Chernikov wrote:

Hi,

I'm going to merge projects/ipfw branch to HEAD in the middle of next week.

Alexander,

Nice job..

The change list looks impressive.
Really looking forward to start working with the new table styles and
options.. It will take time to get a real grasp of what new
opportunities have become possible.

Not running any HEAD systems at the moment, but looking eagerly for a
possible MFC to STABLE.
Is that in the start at any point in time?

I plan to merge it 1 month after committing to HEAD.


--WjW





___
freebsd-ipfw@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-ipfw
To unsubscribe, send any mail to freebsd-ipfw-unsubscr...@freebsd.org


Re: HEADS UP: Merging projects/ipfw to HEAD

2014-10-05 Thread Alexander V. Chernikov

On 05.10.2014 20:33, Jan Bramkamp wrote:

On 04.10.2014 14:35, Alexander V. Chernikov wrote:

Hi,

I'm going to merge projects/ipfw branch to HEAD in the middle of next
week.

What has changed:

Main user-visible changes are related to tables:

* Tables are now identified by names, not numbers. There can be up to
65k tables with up to 63-byte long names.
* Tables are now set-aware (default off), so you can switch/move them
atomically with rules.
* More functionality is supported (swap, lock, limits, user-level
lookup, batched add/del) by generic table code.
* New table types are added (flow) so you can match multiple packet
fields at once.
* The ability to add different types of lookup algorithms for a particular
table type has been added.
* New table algorithms are added (cidr:hash, iface:array, number:array
and flow:hash) to make certain types of lookup more efficient.
* Table values are now capable of holding multiple data fields for
different tablearg users

Are IPv6 addresses supported as tablearg (in fwd)?

Well, _currently_ not.
However, it can be done in 1-2 hours of work.
You can already specify an IPv6 address as one of the value types for tablearg;
the only thing that needs to be implemented is the runtime code that applies
this tablearg.



HEADS UP: Merging projects/ipfw to HEAD

2014-10-04 Thread Alexander V. Chernikov

Hi,

I'm going to merge projects/ipfw branch to HEAD in the middle of next week.

What has changed:

Main user-visible changes are related to tables:

* Tables are now identified by names, not numbers. There can be up to 
65k tables with up to 63-byte long names.
* Tables are now set-aware (default off), so you can switch/move them 
atomically with rules.
* More functionality is supported (swap, lock, limits, user-level 
lookup, batched add/del) by generic table code.
* New table types are added (flow) so you can match multiple packet 
fields at once.
* The ability to add different types of lookup algorithms for a particular
table type has been added.
* New table algorithms are added (cidr:hash, iface:array, number:array
and flow:hash) to make certain types of lookup more efficient.
* Table values are now capable of holding multiple data fields for
different tablearg users


Some examples (see ipfw(8) manual page for the description):

   0:02 [2] zfscurr0# ipfw table fl2 create type flow:src-ip,proto,dst-port algo flow:hash valtype skipto,fib
   0:02 [2] zfscurr0# ipfw table fl2 info
   +++ table(fl2), set(0) +++
    kindex: 0, type: flow:src-ip,proto,dst-port
    valtype: number, references: 0
    algorithm: flow:hash
    items: 0, size: 280
   0:02 [2] zfscurr0# ipfw table fl2 add 2a02:6b8::333,tcp,443 45000,12
   0:02 [2] zfscurr0# ipfw table fl2 add 10.0.0.92,tcp,80 22000,13
   0:02 [2] zfscurr0# ipfw table fl2 list
   +++ table(fl2), set(0) +++
   2a02:6b8::333,6,443 45000
   10.0.0.92,6,80 22000
   0:02 [2] zfscurr0# ipfw add 200 count tcp from me to 78.46.89.105 80 flow 'table(fl2)'


   ipfw table mi_test create type cidr algo cidr:hash masks=/30,/64
   ipfw table mi_test add 10.0.0.8/30
   ipfw table mi_test add 2a02:6b8:b010::1/64 25

   # ipfw table si add 1.1.1.1/32  2.2.2.2/32 
   added: 1.1.1.1/32 
   added: 2.2.2.2/32 
   # ipfw table si add 2.2.2.2/32 2200 4.4.4.4/32 
   exists: 2.2.2.2/32 2200
   added: 4.4.4.4/32 
   ipfw: Adding record failed: record already exists
   ^ Returns error but keeps inserted items
   # ipfw table si list
   +++ table(si), set(0) +++
   1.1.1.1/32 
   2.2.2.2/32 
   4.4.4.4/32 
   # ipfw table si atomic add 3.3.3.3/32  4.4.4.4/32 4400 5.5.5.5/32
   added(reverted): 3.3.3.3/32 
   exists: 4.4.4.4/32 4400
   ignored: 5.5.5.5/32 
   ipfw: Adding record failed: record already exists
   ^ Returns error and reverts added records
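
The difference between the two batch modes matters mostly for scripts. A minimal sketch of a wrapper, assuming the `atomic` keyword behaves as in the transcript above (either all entries are added or none); the function name `table_add_atomic` is made up for illustration and is not part of ipfw:

```shell
# Add every given entry to a table, or none of them.
# Usage: table_add_atomic TABLE ENTRY [VALUE] [ENTRY [VALUE]] ...
table_add_atomic() {
    table=$1
    shift
    if ipfw table "$table" atomic add "$@"; then
        return 0
    fi
    # Per the transcript above, a failed atomic batch reverts the entries
    # it had already inserted, so the table should be unchanged here.
    echo "table $table: batch rejected, table left unchanged" >&2
    return 1
}
```

For example, `table_add_atomic si 3.3.3.3/32 4.4.4.4/32 4400 5.5.5.5/32` would either add all three entries or leave table `si` as it was, which is easier to reason about in a reload script than a partially applied batch.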

Performance changes:
* Main ipfw lock was converted to rmlock
* Rule counters were separated from the rule itself and made per-CPU
* Radix table entries now fit into 128 bytes
* struct ip_fw is now more compact, so more rules will fit into 64 bytes
* Interface tables use an array of existing ifindexes for faster matching

ABI changes:
All functionality supported by old ipfw(8) remains functional. Old and new
binaries can work together with the following restrictions:
* Tables named other than ^\d+$ are shown as table(65535) in the ruleset by
old binaries
* I'm a bit unsure about the "lookup src-port|dst-port N" case; something
may be broken here. Anyway, this can be fixed for MFC
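
A script that has to work with both old and new binaries could check table names before using them. A minimal sketch; `is_legacy_table_name` is a hypothetical helper that only mirrors the ^\d+$ pattern mentioned above (it does not check the numeric range that old binaries accept):

```shell
# Succeeds if the name consists of decimal digits only (^\d+$),
# i.e. old ipfw(8) binaries should be able to display it.
is_legacy_table_name() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit
        *) return 0 ;;
    esac
}
```

With this, `is_legacy_table_name 18` succeeds, while `is_legacy_table_name fl2` fails, so a script can warn before creating a table that old tools would show as table(65535).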


Internal changes:
Changing table ids to numbers resulted in format modification for most
sockopt codes.
Old sopt format was compact, but very hard to extend (no versioning,
inability to add more opcodes), so
* All relevant opcodes were converted to TLV-based versioned
IP_FW3-based codes.
* The remaining opcodes were also converted, to be able to eliminate all
older opcodes at once
* All IP_FW3 handlers use a special API instead of calling sooptcopy*
directly, to ease adding other communication methods

* struct ip_fw is now different for kernel and userland
* tablearg value has been changed to 0 to ease future extensions
* table values are now indexes in a special value array which holds
extended data for the given index

* Batched add/delete has been added to tables code
* Most changes have been done to permit batched rule addition.
* Interface tracking API has been added (started on demand) to permit
efficient interface table operations
* O(1) skipto cache, currently turned off by default at compile-time
(eats 512K).


* Several steps have been made towards making libipfw:
  * most of the new functions were separated into parse/prepare/show and
actually-do-stuff pieces (already merged).
  * there are separate functions for parsing a text string into struct
ip_fw and printing struct ip_fw to a supplied buffer (already merged).

* Probably some more less significant/forgotten features



Re: High intr CPU % and slow throughput

2014-09-19 Thread Alexander V. Chernikov
On 18.09.2014 23:30, Freddie Cash wrote:
 Aha!  I believe I've found the cause of our current issue.
 
 In an effort to allow reloading of the firewall rules during the day
 without disconnecting anyone (dropping TCP connections), I started playing
 with rule sets.  And everything appeared to be working wonderfully, in that
 I could restart the rules multiple times without dropping any packets or
 disconnecting anyone.
 
 But, CPU usage skyrocketed on large downloads and we were capped at a little
 less than 40 Mbps.  :(
 
 It seems that if you do the following (at least twice, to make sure rules
 are in both sets), your CPU will melt:
   - clear set 1
   - disable set 1
   - load 4000 rules into set 1
   - enable set 1
   - swap sets 1 and 0
   - disable set 1
 
 I thought that would leave only the rules in set 0 active, which would be
 the equivalent of only having loaded rules into set 0.  However, it seems
 that ipfw still checks rules in disabled sets!  Or does some kind of
 processing with disabled sets.
Yes. _All_ rules in all sets are referenced inside a single array.
ipfw does not execute rules from disabled sets, but for each such rule it still has to
1) load the rule into the CPU cache
2) check whether the rule's set matches the disabled-set mask

 
 pmcstat was showing lots (200-2000) of unresolved samples and ipfw.ko
 sitting at 80-90% in the list, even when CPU usage was around 30%.
 
 I did the above, but added ipfw -f set 1 flush as the last step, and
 everything is back to normal.  pmcstat is now empty (0 unresolved).
 
 We can now push 75 Mbps through the firewall with CPU usage under 80%.
  More importantly, though, other traffic is not impacted by large downloads
 and speedtests and streaming video!  And, CPU usage is sitting at under 10%
 for normal traffic.
 
 Yes, I know 4000 rules is a lot (doing NAT for 66 systems and 2 local
 subnets).  Until now I was focusing on getting things working (migrating
 from FreeBSD 7 using IPFW+natd with lots of private IP to private IP rules;
 to FreeBSD 10 using IPFW + in-kernel NAT and proper double-NAT across
 networks using public IPs only).  Optimisation work is just now beginning.
  :)
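
The reload procedure from this thread, with the final flush added, can be sketched as a shell function. This is only an illustration of the sequence described above: `reload_ruleset` and its argument are hypothetical names, the loader is assumed to add every rule with `set 1` and an explicit rule number, and set 1 is assumed to stay disabled throughout (so the enable/disable steps around the swap are left out).

```shell
# Reload a ruleset through a staging set, then drop the old rules so
# they no longer sit in the rule array behind a disabled set.
# $1: a command or function that adds the new rules with "set 1".
reload_ruleset() {
    load_rules=$1
    ipfw -f set 1 flush || return 1   # make sure the staging set is empty
    ipfw set disable 1  || return 1   # keep staged rules inactive while loading
    "$load_rules"       || return 1   # new rules go into set 1
    ipfw set swap 1 0   || return 1   # new rules become set 0, old ones set 1
    ipfw -f set 1 flush               # remove the old rules instead of keeping them disabled
}
```

The last line is the step that made the difference in the report above: without it, the previous generation of rules stays in the kernel and is still walked for every packet.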
 


Re: High intr CPU % and slow throughput

2014-09-18 Thread Alexander V. Chernikov
On 18.09.2014 20:26, Freddie Cash wrote:
 [Not sure if this is more appropriate for the -ipfw or -stable mailing
 lists.]
 
 
 64-bit FreeBSD 10.0-p7
 
 Dual-core AMD Opteron 1218 CPU @ 2.6 GHz
 2 GB of DDR2 RAM
 Intel i350-T4 quad-port gigabit NIC using igb(4)
 
 Each of the gigabit NIC ports are connected to gigabit links (we have a
 gigabit fibre link to our ISP, which has dual 10 Gbps links to the public
 Internet).
 
 Using the following simple ruleset (there are more rules, but these are the
 ones that match when we test transfers to/from the Internet):
Please show the whole ruleset with counters.

 
 ipfw nat 8668 config ip 142.24.x.y same_ports
 
 10 allow ip from any to any via lo0
 12 allow carp from any to any
 
 20 reject log logamount 1 ip from 10.0.0.0/8 to any in recv igb0
 22 reject log logamount 1 ip from 127.0.0.0/8 to any in recv igb0
 24 reject log logamount 1 ip from 172.16.0.0/20 to any in recv igb0
 26 reject log logamount 1 ip from 192.168.0.0/16 to any in recv igb0
 
 50 skipto 65000 ip from 192.168.0.0/24 to not 142.24.x.z/25 in recv igb2
 52 skipto 65000 ip from not 142.24.13.128/25 to 142.24.x.y in recv igb0
 
 65000 allow ip from 192.168.0.0/24 to any in recv igb2
 65002 nat 8668 ip from 192.168.0.0/24 to any out xmit igb0
 65004 allow ip from 142.24.x.y to any out xmit igb0
 
 65006 nat 8668 ip from any to 142.24.x.y in recv igb0
 65008 allow ip from any to 192.168.0.0/24 in recv igb0
 65010 allow ip from any to 192.168.0.0/24 out xmit igb2

 
 When we start a large download or file transfer from the Internet (a single
 file from a single server), CPU usage for the [intr{irq256: igb0:que}]
 kernel thread jumps to over 90% (one CPU core) and causes all traffic
 through the firewall (even traffic that doesn't go through igb0) to grind
 to a standstill.  Some TCP connections through other interfaces are even
 dropped.  During this time, the other CPU core is under 50% usage.
Can you do the following:
kldload hwpmc
sudo pmcstat -TS instructions -w 1

and show its output when the problem is observed?
 
 IIUIC, the [intr{irq256: igb0:que}] isn't showing actual CPU usage for
 processing hardware interrupts, but is showing the CPU usage used to
 process the packets going through IPFW.  Correct?  vmstat -i shows only
 10-15 interrupts per second for each of the igb interfaces.
 
 The really depressing part is that throughput (as shown by iftop -i igb0
 and snmp graphing) never goes above 40 Mbps.  :(
 
 What can I do to try and track down exactly why this is occurring?
 
 Is there anything I can do to reduce or mitigate this CPU usage?
 
 Or, is this simply a case of the CPU being too old?
 
 /boot/loader.conf currently has the following (been playing with most of
 these lately, without much change in CPU usage):
 
 ##  Tune the igb(4) interfaces a little
 hw.igb.enable_aim=1
 hw.igb.enable_msix=1
 hw.igb.header_split=0
 hw.igb.max_interrupt_rate=16000
 hw.igb.num_queues=0
 hw.igb.rx_process_limit=1000
 hw.igb.rxd=4096
 hw.igb.txd=4096
 
 ##  Configure kernel
 kern.hz=4000
 
 ##  Configure IPFW
 net.inet.ip.fw.default_to_accept=1
 net.inet.ip.fw.verbose=1
 
 ##  Configure network threads
 net.isr.bindthreads=1
 net.isr.direct=1
 net.isr.maxthreads=2
 
 
 /etc/sysctl.conf has the following (haven't changed these in a long time):
 
 # IPFW options
 net.inet.ip.fw.autoinc_step=2
 net.inet.ip.fw.enable=1
 net.inet.ip.fw.one_pass=1
 net.inet.ip.fw.verbose=1
 net.inet.ip.fw.verbose_limit=1
 
 
 At lunch today, we'll be failing-over to the other firewall, which will be
 running without any /boot/loader.conf or /etc/sysctl.conf entries to see if
 my optimisations are actually pessimisations.
 
 


Re: IPFW rule sets and automatic rule numbering

2014-09-13 Thread Alexander V. Chernikov

On 11.09.2014 19:01, Freddie Cash wrote:

Good morning everyone,

Just wondering if I'm doing things wrong, or if those two features (rule
sets and auto incrementing rule numbers) just don't play well together.

Until now, I've used the auto-incrementing feature to minimize the amount
of work I need to do when changing/updating/adding rules in the middle of
my scripts.  This has been working great, and is controlled via
the net.inet.ip.fw.autoinc_step sysctl.

Recently I was playing with the rule sets feature and using ipfw set swap
to speed up my firewall rules reloading times.  Previously, I'd clear the
rules, then load the new rules, but that could leave up to 30 seconds of
downtime.  With the use of sets, that's under 1 sec.

Everything works well on the first run.  Everything is loaded correctly
into set 1, then swapped into set 0 and made live.  All rules are numbered
correctly.

On the second run, all the rules are loaded into set 1 using rule numbers
65524-65534, and then swapped into set 0.

On the third (and all subsequent runs), all rules are loaded into set 1 with
rule number 65534, and then swapped into set 0.

It seems the rule numbers are global across all sets?  Meaning, the last
used automatic number is global across all sets?

I was expecting the rule numbers to be unique per set.  I do the following
to clear out rule set 1 before adding rules:

ipfw -f set 1 flush
ipfw set disable 1

Then load all my rules into set 1 using the following syntax:

ipfw add set 1 allow tcp from 1.2.3.4 to 2.3.4.4 in recv igb0




Then swap the rules at the end using:

ipfw set swap 1 0

Is there anything I could be doing differently to get the numbering to work
the way I expect it to?  Or am I going to have to manually number every
rule in my scripts?

No, currently rule auto-numbering ignores sets.
So for now you have to number rules manually to get predictable
behavior.


I think we can consider implementing a sysctl which permits per-set
auto-numbering.
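
Until something like that exists, explicit numbers can be generated by the script itself. A minimal sketch, not part of ipfw: `number_rules` is a hypothetical helper that reads rule bodies on stdin and prints `ipfw add` commands with explicit rule numbers for the given set, so numbering no longer depends on what is already loaded in other sets.

```shell
# Print "ipfw add N set S <rule>" for each rule body read from stdin.
# $1: first rule number, $2: step between rules, $3: set number.
number_rules() {
    num=$1
    step=$2
    setnum=$3
    while IFS= read -r body; do
        # Skip blank lines and comment lines.
        case $body in ''|'#'*) continue ;; esac
        # 65535 is the default rule, so user rules must stay below it.
        if [ "$num" -gt 65534 ]; then
            echo "rule number $num exceeds 65534" >&2
            return 1
        fi
        printf 'ipfw add %d set %d %s\n' "$num" "$setnum" "$body"
        num=$((num + step))
    done
}
```

A rules file with one rule body per line (for example `allow tcp from 1.2.3.4 to 2.3.4.4 in recv igb0`) can then be turned into commands with `number_rules 100 2 1 < rules.txt | sh`, which keeps the convenience of auto-increment while making every run produce the same numbers.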









Re: ipfw named objejcts, table values and syntax change

2014-08-19 Thread Alexander V. Chernikov

On 15.08.2014 19:20, Alexander V. Chernikov wrote:

On 15.08.2014 18:19, Dmitry Selivanov wrote:

15.08.2014 17:25, Alexander V. Chernikov wrote:

On 08.08.2014 16:11, Dmitry Selivanov wrote:

04.08.2014 23:51, Alexander V. Chernikov wrote:

On 04.08.2014 15:58, Luigi Rizzo wrote:

On Mon, Aug 04, 2014 at 01:44:26PM +0400, Alexander V. Chernikov wrote:

On 02.08.2014 12:33, Alexander V. Chernikov wrote:

On 02.08.2014 10:33, Luigi Rizzo wrote:

On Fri, Aug 1, 2014 at 11:08 PM, Alexander V. Chernikov
<melif...@freebsd.org> wrote:

  Hello all.

  I'm currently working on enhancing ipfw in some areas.
  The most notable (and user-visible) change is named table support.
  The other one is support for different lookup algorithms for different
  key types.

  For example, new ipfw permits writing this:

  ipfw table tb1 create type cidr
  ipfw add allow ip from table(tb1) to any
  ipfw add allow ip from any lookup dst-ip tb1

  ipfw table if1 create type iface
  ipfw add skipto tablearg ip from any to any via table(if1)

  or even this:
  ipfw table fl1 create type flow:src-ip,proto,dst-ip,dst-port
  ipfw table fl1 add 10.0.0.5,tcp,10.0.0.6,80
  ipfw add allow ip from any to any flow table(fl1)

  All these changes fully preserve backward compatibility.
  (Actually, tables now need to be created before use and their type needs
  to match the opcode used, but new ipfw(8) performs auto-creation
  for cidr tables.)

  There is another thing I'm going to change and I'm not sure I can keep
  the same compatibility level.

  Table values, from one point of view, can be classified into the following
  types:

  - skipto argument
  - fwd argument (*)
  - link to another object (nat, pipe, queue)
  - plain u32 (not bound to any object)
  (divert/tee,netgraph,tag/utag,limit)

  There are the following reasons why I think it is necessary to implement
  explicit table value typing (like for tables):
  - Implementing fwd tablearg for IPv6 hosts requires an indirection table
  - Converting nat/pipe instance ids to names renders values unusable
  - retiring the old hack of storing a saved pointer to the found object/rule
  inside the rule w/o proper locking
  - making skipto faster


I don't buy the idea that you need typed arguments
for all the cases above. Maybe the case that
may make sense is the fwd argument (and in the future
something else).
We already discussed, I think, the fact that now it
is legal to have references to non-existing things
(skipto, pipes etc.) implemented as u32.
Removing that would break configurations.

It depends on the actual implementation. This can be preserved by
auto-creating necessary objects in kernel and/or in userspace, so
we can (and should) avoid breaking in this particular way.

Can you please explain your vision on values another time?
As far as I understand, you're not against it in general, but the
details matter:
* IP address can be one of the types (it won't break much, and we can
simply skip that one for MFC)
* what about typing for nat/pipes? We're not going to convert their ids
to names? (or maybe you can suggest another non-disruptive way?)
* everything else is type u32


Correct, I am mostly concerned about the details, not the general
concept.


To summarize the discussion Alexander and I had about converting
identifiers from numbers to arbitrary strings (this is partly related
to the values stored in tables, but I think we should have a coherent
behaviour)

1. CURRENTLY ipfw uses numeric identifiers in a small range (16 bits
or less) for rules, pipes, queues, tables, probably nat instances.

2. CURRENTLY, in all the above contexts, it is legal to reference a
non-existing object (rule, pipe, table names, etc.),
and the kernel will do something reasonable, namely jump to the
next rule, drop traffic for non-existing pipes, and so on.

3. of course we want to preserve backward compatibility both for
the ioctl interface, and for user configurations.

4. The in-kernel representation of identifiers is not visible to users,
so we can use a numeric representation in the kernel for identifiers.
Strings like 12345 are converted with atoi() or the like,
whereas for other identifiers or numbers outside of the 2^16 range
the kernel manages a translation table, allocating new numeric
identifiers if a new string appears.
This permits backward compatibility for old rulesets, and does not
impact performance because the translation table is only
used during rule additions or deletions.

Yes. However, this requires holding either (1) 2 pointers (old and new
arrays), or (2) a 65k+ index array, or (3) a chained hash table.
(1) would require additional pointers for each subsystem (and some
additional management),
(2) will definitely upset embedded guys
Re: ipfw named objejcts, table values and syntax change

2014-08-19 Thread Alexander V. Chernikov

On 19.08.2014 20:06, Dmitry Selivanov wrote:

19.08.2014 17:50, Alexander V. Chernikov wrote:

On 15.08.2014 19:20, Alexander V. Chernikov wrote:

On 15.08.2014 18:19, Dmitry Selivanov wrote:

15.08.2014 17:25, Alexander V. Chernikov wrote:

On 08.08.2014 16:11, Dmitry Selivanov wrote:

04.08.2014 23:51, Alexander V. Chernikov wrote:

On 04.08.2014 15:58, Luigi Rizzo wrote:

On Mon, Aug 04, 2014 at 01:44:26PM +0400, Alexander V. Chernikov wrote:

On 02.08.2014 12:33, Alexander V. Chernikov wrote:

On 02.08.2014 10:33, Luigi Rizzo wrote:

On Fri, Aug 1, 2014 at 11:08 PM, Alexander V. Chernikov
<melif...@freebsd.org> wrote:

  Hello all.

  I'm currently working on enhancing ipfw in some areas.
  The most notable (and user-visible) change is named table support.
  The other one is support for different lookup algorithms for different
  key types.

  For example, new ipfw permits writing this:

  ipfw table tb1 create type cidr
  ipfw add allow ip from table(tb1) to any
  ipfw add allow ip from any lookup dst-ip tb1

  ipfw table if1 create type iface
  ipfw add skipto tablearg ip from any to any via table(if1)

  or even this:
  ipfw table fl1 create type flow:src-ip,proto,dst-ip,dst-port
  ipfw table fl1 add 10.0.0.5,tcp,10.0.0.6,80
  ipfw add allow ip from any to any flow table(fl1)

  All these changes fully preserve backward compatibility.
  (Actually, tables now need to be created before use and their type needs
  to match the opcode used, but new ipfw(8) performs auto-creation
  for cidr tables.)

  There is another thing I'm going to change and I'm not sure I can keep
  the same compatibility level.

  Table values, from one point of view, can be classified into the following
  types:

  - skipto argument
  - fwd argument (*)
  - link to another object (nat, pipe, queue)
  - plain u32 (not bound to any object)
  (divert/tee,netgraph,tag/utag,limit)

  There are the following reasons why I think it is necessary to implement
  explicit table value typing (like for tables):
  - Implementing fwd tablearg for IPv6 hosts requires an indirection table
  - Converting nat/pipe instance ids to names renders values unusable
  - retiring the old hack of storing a saved pointer to the found object/rule
  inside the rule w/o proper locking
  - making skipto faster


I don't buy the idea that you need typed arguments
for all the cases above. Maybe the case that
may make sense is the fwd argument (and in the future
something else).
We already discussed, I think, the fact that now it
is legal to have references to non-existing things
(skipto, pipes etc.) implemented as u32.
Removing that would break configurations.

It depends on the actual implementation. This can be preserved by
auto-creating necessary objects in kernel and/or in userspace, so
we can (and should) avoid breaking in this particular way.

Can you please explain your vision on values another time?
As far as I understand, you're not against it in general, but the
details matter:
* IP address can be one of the types (it won't break much, and we can
simply skip that one for MFC)
* what about typing for nat/pipes? We're not going to convert their ids
to names? (or maybe you can suggest another non-disruptive way?)
* everything else is type u32


Correct, I am mostly concerned about the details, not the general
concept.


To summarize the discussion Alexander and I had about converting
identifiers from numbers to arbitrary strings (this is partly related
to the values stored in tables, but I think we should have a coherent
behaviour)

1. CURRENTLY ipfw uses numeric identifiers in a small range (16 bits
or less) for rules, pipes, queues, tables, probably nat instances.

2. CURRENTLY, in all the above contexts, it is legal to reference a
non-existing object (rule, pipe, table names, etc.),
and the kernel will do something reasonable, namely jump to the
next rule, drop traffic for non-existing pipes, and so on.

3. of course we want to preserve backward compatibility both for
the ioctl interface, and for user configurations.

4. The in-kernel representation of identifiers is not visible to users,
so we can use a numeric representation in the kernel for identifiers.
Strings like 12345 are converted with atoi() or the like,
whereas for other identifiers or numbers outside of the 2^16 range
the kernel manages a translation table, allocating new numeric
identifiers if a new string appears.
This permits backward compatibility for old rulesets, and does not
impact performance because the translation table is only
used during rule additions or deletions.

Yes. However, this requires holding either (1) 2 pointers (old and new
arrays), or (2) a 65k+ index array, or (3) a chained hash table.
(1) would require additional

Re: ipfw named objejcts, table values and syntax change

2014-08-15 Thread Alexander V. Chernikov

On 08.08.2014 16:11, Dmitry Selivanov wrote:

04.08.2014 23:51, Alexander V. Chernikov пишет:

On 04.08.2014 15:58, Luigi Rizzo wrote:

On Mon, Aug 04, 2014 at 01:44:26PM +0400, Alexander V. Chernikov wrote:

On 02.08.2014 12:33, Alexander V. Chernikov wrote:

On 02.08.2014 10:33, Luigi Rizzo wrote:



On Fri, Aug 1, 2014 at 11:08 PM, Alexander V. Chernikov
melif...@freebsd.org mailto:melif...@freebsd.org wrote:

  Hello all.

  I'm currently working on to enhance ipfw in some areas.
  The most notable (and user-visible) change is named table 
support.
  The other one is support for different lookup algorithms 
for different

  key types.

  For example, new ipfw permits writing this:

  ipfw table tb1 create type cidr
  ipfw add allow ip from table(tl1) to any
  ipfw add allow ip from any lookup dst-ip tb1

  ipfw table if1 create type iface
  ipfw add skipto tablearg ip from any to any via table(if1)

  or even this:
  ipfw table fl1 create type flow:src-ip,proto,dst-ip,dst-port
  ipfw table fl1 add 10.0.0.5,tcp,10.0.0.6,80 
  ipfw add allow ip from any to any flow table(fl1)

  all these changes fully preserve backward compatibility.
  (actually tables needs now to be created before use and 
their type needs
  to match with opcode used, but new ipfw(8) performs 
auto-creation

  for cidr tables).

  There is another thing I'm going to change and I'm not sure 
I can keep

  the same compatibility level.

  Table values, from one point of view, can be classified to 
the following

  types:

  - skipto argument
  - fwd argument (*)
  - link to another object (nat, pipe, queue)
  - plain u32 (not bound to any object)
  (divert/tee,netgraph,tag/utag,limit)

  There are the following reasons why I think it is necessary 
to implement

  explicit table values typing (like tables):
  - Implementing fwd tablearg for IPv6 hosts requires 
indirection table
  - Converting nat/pipe instance ids to names renders values 
unusable
  - retiring old hack with storing saved pointer of found 
object/rule

  inside rule w/o proper locking
  - making faster skipto


??i don't buy the idea that you need typed arguments
for all the cases above. Maybe the case that
may make sense is the fwd argument (and in the future
something else).
We already discussed, i think, the fact that now it
is legal to have references to non existing things
(skipto, pipes etc.) implemented as u32.
Removing that would break configurations.

It depends on actual implementation. This can be preserved by
auto-creating necessary objects in kernel and/or in userspace, so
we can (and should) avoid breaking in this particular way.

Can you please explain your vision on values another time?
As far as I understand, you're not against it in general, but the
details matter:
* IP address can be one of the types (it won't break much, and we can
simply skip that one for MFC)
* what about typing for nat/pipes ? we're not going to convert 
their ids

to names? (or maybe you can suggest other non-disruptive way?)
* everything else is type u32


Correct, I am mostly concerned about the details, not on the general 
concept.


To summarize the discussion Alexander and I had about converting
identifiers from numbers to arbitrary strings (this is partly related
to the values stored in tables, but I think we should have a coherent
behaviour)

1. CURRENTLY ipfw uses numeric identifiers in a small range (16 bits 
or less)

for rules, pipes, queues, tables, probably nat instances.

2. CURRENTLY, in all the above contexts, it is legal to reference a
non existing object (rule, pipe, table names, etc.),
and the kernel will do something reasonable, namely jump to the
next rule, drop traffic for non existing pipes, and so on.

3. of course we want to preserve backward compatibility both for
the ioctl interface, and for user configurations.

4. The in-kernel representation of identifiers is not visible to users,
so we can use a numeric representation in the kernel for 
identifiers.

Strings like 12345 are converted with atoi() or the like,
whereas for other identifiers or numbers outside of the 2^16 range
the kernel manages a translation table, allocating new numeric
identifiers if a new string appears.
This permits backward compatibility for old rulesets, and does not
impact performance because the translation table is only
used during rules additions or deletion.

Yes. However, this requires holding either (1) 2 pointers (old & new
arrays), (2) a 65k+ index array, or (3) a chained hash table.
(1) would require additional pointers for each subsystem (and some
additional management),
(2) will definitely upset embedded guys and
(3) is worse in terms of performance
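
The translation scheme from point 4 can be sketched roughly as follows. This is only an illustration, not the ipfw code: the class name `IdentifierTable`, the dict-based storage and the choice to hand out new ids starting at 2^16 are all assumptions made for the example.

```python
LEGACY_LIMIT = 1 << 16  # numeric ids below 2^16 keep their own value


class IdentifierTable:
    """Hypothetical name -> numeric id translation, used only on rule add/delete."""

    def __init__(self):
        self._allocated = {}           # non-legacy identifier -> allocated id
        self._next_id = LEGACY_LIMIT   # assumption: new ids start above the legacy range

    def resolve(self, name: str) -> int:
        # Plain decimal strings in the legacy range are converted directly
        # (the "atoi() or the like" case).
        if name.isascii() and name.isdigit() and int(name) < LEGACY_LIMIT:
            return int(name)
        # Anything else (arbitrary strings, out-of-range numbers) gets a
        # numeric id allocated on first use and reused afterwards.
        if name not in self._allocated:
            self._allocated[name] = self._next_id
            self._next_id += 1
        return self._allocated[name]


ids = IdentifierTable()
print(ids.resolve("12345"), ids.resolve("tb1"), ids.resolve("70000"))
```

A dict plays the role of option (3) here; options (1) and (2) trade memory for avoiding the hash lookup.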


With this in mind, i think we should follow a similar approach for
objects stored in tables, hence

if an u32 value

Re: ipfw named objejcts, table values and syntax change

2014-08-15 Thread Alexander V. Chernikov

On 15.08.2014 18:19, Dmitry Selivanov wrote:

15.08.2014 17:25, Alexander V. Chernikov wrote:

On 08.08.2014 16:11, Dmitry Selivanov wrote:

04.08.2014 23:51, Alexander V. Chernikov wrote:

On 04.08.2014 15:58, Luigi Rizzo wrote:
On Mon, Aug 04, 2014 at 01:44:26PM +0400, Alexander V. Chernikov 
wrote:

On 02.08.2014 12:33, Alexander V. Chernikov wrote:

On 02.08.2014 10:33, Luigi Rizzo wrote:



On Fri, Aug 1, 2014 at 11:08 PM, Alexander V. Chernikov
melif...@freebsd.org wrote:

  Hello all.

  I'm currently working on enhancing ipfw in some areas.
  The most notable (and user-visible) change is named table 
support.
  The other one is support for different lookup algorithms 
for different

  key types.

  For example, new ipfw permits writing this:

  ipfw table tb1 create type cidr
  ipfw add allow ip from table(tb1) to any
  ipfw add allow ip from any lookup dst-ip tb1

  ipfw table if1 create type iface
  ipfw add skipto tablearg ip from any to any via table(if1)

  or even this:
  ipfw table fl1 create type flow:src-ip,proto,dst-ip,dst-port
  ipfw table fl1 add 10.0.0.5,tcp,10.0.0.6,80 
  ipfw add allow ip from any to any flow table(fl1)

  all these changes fully preserve backward compatibility.
  (actually tables needs now to be created before use and 
their type needs
  to match with opcode used, but new ipfw(8) performs 
auto-creation

  for cidr tables).

  There is another thing I'm going to change and I'm not 
sure I can keep

  the same compatibility level.

  Table values, from one point of view, can be classified 
to the following

  types:

  - skipto argument
  - fwd argument (*)
  - link to another object (nat, pipe, queue)
  - plain u32 (not bound to any object)
  (divert/tee,netgraph,tag/utag,limit)

  There are the following reasons why I think it is 
necessary to implement

  explicit table values typing (like tables):
  - Implementing fwd tablearg for IPv6 hosts requires 
indirection table
  - Converting nat/pipe instance ids to names renders 
values unusable
  - retiring old hack with storing saved pointer of found 
object/rule

  inside rule w/o proper locking
  - making faster skipto


i don't buy the idea that you need typed arguments
for all the cases above. Maybe the case that
may make sense is the fwd argument (and in the future
something else).
We already discussed, i think, the fact that now it
is legal to have references to non existing things
(skipto, pipes etc.) implemented as u32.
Removing that would break configurations.

It depends on actual implementation. This can be preserved by
auto-creating necessary objects in kernel and/or in userspace, so
we can (and should) avoid breaking in this particular way.

Can you please explain your vision on values another time?
As far as I understand, you're not against it in general, but the
details matter:
* IP address can be one of the types (it won't break much, and we 
can

simply skip that one for MFC)
* what about typing for nat/pipes ? we're not going to convert 
their ids

to names? (or maybe you can suggest other non-disruptive way?)
* everything else is type u32


Correct, I am mostly concerned about the details, not on the 
general concept.


To summarize the discussion Alexander and I had about converting
identifiers from numbers to arbitrary strings (this is partly related
to the values stored in tables, but I think we should have a coherent
behaviour)

1. CURRENTLY ipfw uses numeric identifiers in a small range (16 
bits or less)

for rules, pipes, queues, tables, probably nat instances.

2. CURRENTLY, in all the above contexts, it is legal to reference a
non existing object (rule, pipe, table names, etc.),
and the kernel will do something reasonable, namely jump to the
next rule, drop traffic for non existing pipes, and so on.

3. of course we want to preserve backward compatibility both for
the ioctl interface, and for user configurations.

4. The in-kernel representation of identifiers is not visible to 
users,
so we can use a numeric representation in the kernel for 
identifiers.

Strings like 12345 are converted with atoi() or the like,
whereas for other identifiers or numbers outside of the 2^16 
range

the kernel manages a translation table, allocating new numeric
identifiers if a new string appears.
This permits backward compatibility for old rulesets, and does 
not

impact performance because the translation table is only
used during rules additions or deletion.
Yes. However, this requires holding either (1) 2 pointers (old & new
arrays), (2) a 65k+ index array, or (3) a chained hash table.
(1) would require additional pointers for each subsystem (and some
additional management),
(2) will definitely upset embedded guys and
(3) is worse in terms of performance

Re: [CFT] new tables for ipfw

2014-08-14 Thread Alexander V. Chernikov

On 14.08.2014 13:23, Luigi Rizzo wrote:




On Wed, Aug 13, 2014 at 10:11 PM, Alexander V. Chernikov 
melif...@yandex-team.ru wrote:


Hello list.

I've been hacking ipfw for a while and it seems there is something
ready to test/review in projects/ipfw branch.


this is a fantastic piece of work, thanks for doing it and for
integrating the feedback.

I have some detailed feedback that I will send you privately,
but just a curiosity:

...

Some examples (see ipfw(8) manual page for the description):

...


  ipfw table mi_test create type cidr algo cidr:hash masks=/30,/64


why do we need to specify mask lengths in the above?

Well, since we're hashing IPs, we have to know the mask in advance to
cut off the host bits.
(And the real reason is that I'm too lazy to implement hierarchical
matching (check /32, then /31, then /30), as is done in ipset, for
example, so this particular algorithm supports only a single IPv4 mask
and a single IPv6 mask.)

Anyway, it is not too hard to add another algo which is doing the above.
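
To illustrate why the mask has to be known up front, here is a small sketch of a single-mask hash table. It is not the cidr:hash implementation; `SingleMaskHash` and its methods are names invented for this example, and a Python dict stands in for the kernel hash.

```python
import ipaddress


class SingleMaskHash:
    """Hypothetical table with exactly one prefix length per address family."""

    def __init__(self, v4_len: int, v6_len: int):
        self._len = {4: v4_len, 6: v6_len}
        self._table = {}  # masked network -> value

    def add(self, prefix: str, value: int = 0) -> None:
        # strict=False clears any host bits, as in "add 2a02:6b8:b010::1/64".
        net = ipaddress.ip_network(prefix, strict=False)
        if net.prefixlen != self._len[net.version]:
            raise ValueError("prefix length does not match the table mask")
        self._table[net] = value

    def lookup(self, addr: str):
        ip = ipaddress.ip_address(addr)
        # One masking step and one hash probe: possible only because the
        # mask for this address family is fixed at table creation time.
        key = ipaddress.ip_network(f"{ip}/{self._len[ip.version]}", strict=False)
        return self._table.get(key)


t = SingleMaskHash(30, 64)
t.add("10.0.0.8/30")
t.add("2a02:6b8:b010::1/64", 25)
print(t.lookup("10.0.0.10"), t.lookup("10.0.0.12"), t.lookup("2a02:6b8:b010::5"))
```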



cheers
luigi



___
freebsd-ipfw@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-ipfw
To unsubscribe, send any mail to freebsd-ipfw-unsubscr...@freebsd.org

Re: [CFT] new tables for ipfw

2014-08-14 Thread Alexander V. Chernikov

On 14.08.2014 14:44, Luigi Rizzo wrote:




On Thu, Aug 14, 2014 at 11:57 AM, Alexander V. Chernikov 
melif...@yandex-team.ru wrote:


On 14.08.2014 13:23, Luigi Rizzo wrote:




On Wed, Aug 13, 2014 at 10:11 PM, Alexander V. Chernikov
melif...@yandex-team.ru wrote:

Hello list.

I've been hacking ipfw for a while and it seems there is
something ready to test/review in projects/ipfw branch.


this is a fantastic piece of work, thanks for doing it and for
integrating the feedback.

I have some detailed feedback that I will send you privately,
but just a curiosity:

...

Some examples (see ipfw(8) manual page for the description):

...


  ipfw table mi_test create type cidr algo cidr:hash masks=/30,/64


why do we need to specify mask lengths in the above?

Well, since we're hashing IP we have to know mask to cut host bits
in advance.
(And the real reason is that I'm too lazy to implement
hierarchical matching (check /32, then /31, then /30) like how,
for example,


oh well for that we should use cidr:radix

Research results have never shown a strong superiority of
hierarchical hash tables over good radix implementations,
and in those cases one usually adopts partial prefix
expansion so you only have, say, masks that are a
multiple of 2..8 bits so you only need a small number of
hash lookups.
Definitely, especially for IPv6. So I was actually thinking about 
covering some special sparse cases (e.g. someone having a bunch of /32 
and a bunch of /30 and that's all).
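
For the sparse case mentioned here (only a couple of distinct prefix lengths), hierarchical matching amounts to one hash probe per length, longest first. The sketch below is an assumption-laden illustration, not proposed ipfw code; `MultiMaskHash` is a made-up name.

```python
import ipaddress


class MultiMaskHash:
    """Hypothetical longest-prefix match built from one hash per prefix length."""

    def __init__(self):
        self._tables = {}  # (ip version, prefix length) -> {network as int: value}

    def add(self, prefix: str, value: int = 0) -> None:
        net = ipaddress.ip_network(prefix, strict=False)
        bucket = self._tables.setdefault((net.version, net.prefixlen), {})
        bucket[int(net.network_address)] = value

    def lookup(self, addr: str):
        ip = ipaddress.ip_address(addr)
        lengths = sorted((plen for ver, plen in self._tables if ver == ip.version),
                         reverse=True)
        for plen in lengths:  # cost grows with the number of distinct lengths
            shift = ip.max_prefixlen - plen
            key = (int(ip) >> shift) << shift  # clear the host bits
            bucket = self._tables[(ip.version, plen)]
            if key in bucket:
                return plen, bucket[key]
        return None


t = MultiMaskHash()
t.add("10.0.0.8/30", 30)
t.add("10.0.0.9/32", 32)
print(t.lookup("10.0.0.9"), t.lookup("10.0.0.10"), t.lookup("10.0.1.1"))
```

With only /32 and /30 present this is two probes at most, which is why the approach is attractive for such sets and much less so for general routing tables.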


Btw, since we're talking about good radix implementation: what license 
does DXR have? :)

Is it OK to merge it as another cidr implementation?



cheers
luigi




Re: [CFT] new tables for ipfw

2014-08-14 Thread Alexander V. Chernikov

On 14.08.2014 15:15, Luigi Rizzo wrote:




On Thu, Aug 14, 2014 at 12:57 PM, Alexander V. Chernikov 
melif...@yandex-team.ru wrote:


On 14.08.2014 14:44, Luigi Rizzo wrote:




On Thu, Aug 14, 2014 at 11:57 AM, Alexander V. Chernikov
melif...@yandex-team.ru wrote:

On 14.08.2014 13:23, Luigi Rizzo wrote:




On Wed, Aug 13, 2014 at 10:11 PM, Alexander V. Chernikov
melif...@yandex-team.ru wrote:

Hello list.

I've been hacking ipfw for a while and it seems there is
something ready to test/review in projects/ipfw branch.


this is a fantastic piece of work, thanks for doing it and for
integrating the feedback.

I have some detailed feedback that I will send you privately,
but just a curiosity:

...

Some examples (see ipfw(8) manual page for the description):

...


  ipfw table mi_test create type cidr algo cidr:hash masks=/30,/64


why do we need to specify mask lengths in the above?

Well, since we're hashing IP we have to know mask to cut host
bits in advance.
(And the real reason is that I'm too lazy to implement
hierarchical matching (check /32, then /31, then /30) like
how, for example,


oh well for that we should use cidr:radix

Research results have never shown a strong superiority of
hierarchical hash tables over good radix implementations,
and in those cases one usually adopts partial prefix
expansion so you only have, say, masks that are a
multiple of 2..8 bits so you only need a small number of
hash lookups.

Definitely, especially for IPv6. So I was actually thinking about
covering some special sparse cases (e.g. someone having a bunch of
/32 and a bunch of /30 and that's all).

Btw, since we're talking about good radix implementation: what
license does DXR have? :)
Is it OK to merge it as another cidr implementation?

cidr is a very ugly name, i'd rather use addr

Ok, no problem with that. addr really sounds better.


DXR has a bsd license and of course it is possible to use it.
You should ask Marko Zec for his latest version of the code
(and probably make sure we have one copy of the code in the source tree).

Great! I'll ask him :)


Speaking of features, one thing that would be nice is the ability
for tables to reference the in-kernel tables (e.g. fibs, socket
lists, interface lists...), perhaps in readonly mode.
How complex do you think that would be ?
Implementing algo support for a particular provider like sockets/iflists 
shouldn't be hard. Most of the complexity of the algorithms lies in table 
modifications. Here we only have to support
lookup and dump operations, so it is a question of providing the necessary 
bindings to existing mechanisms (via some direct binding or by utilizing 
things like kernel_sysctl for dump support).


It looks like the following maps well to current table concept:
* such tables are not created by default
* user issues
 `ipfw table kfib create type addr algo addr:kernel fib=0`
or
 `ipfw table ktcp create type flow algo flow:kernel_tcp fib=0`
or
`ipfw table kiface create type iface algo iface:kernel`
* tables have special readonly type, flush_all requests are ignored
* no state stored internally

So generic table handling code needs to be modified to support read-only 
tables (and to make more callbacks optional).
Additionally, we might need to proxy the info request into an algo callback 
(optional, real algorithms won't implement it) to be able to show the 
number of items (and some other info) to the user.






cheers
luigi




Re: [CFT] new tables for ipfw

2014-08-14 Thread Alexander V. Chernikov

On 14.08.2014 16:08, Willem Jan Withagen wrote:

On 2014-08-14 13:15, Luigi Rizzo wrote:

On Thu, Aug 14, 2014 at 12:57 PM, Alexander V. Chernikov 
melif...@yandex-team.ru wrote:


  On 14.08.2014 14:44, Luigi Rizzo wrote:




On Thu, Aug 14, 2014 at 11:57 AM, Alexander V. Chernikov 
melif...@yandex-team.ru wrote:


   On 14.08.2014 13:23, Luigi Rizzo wrote:




On Wed, Aug 13, 2014 at 10:11 PM, Alexander V. Chernikov 
melif...@yandex-team.ru wrote:


Hello list.

I've been hacking ipfw for a while and it seems there is something
ready to test/review in projects/ipfw branch.


this is a fantastic piece of work, thanks for doing it and for
integrating the feedback.

I have some detailed feedback that I will send you privately,
but just a curiosity:

...

Some examples (see ipfw(8) manual page for the description):

...


   ipfw table mi_test create type cidr algo cidr:hash masks=/30,/64


why do we need to specify mask lengths in the above?

Well, since we're hashing IP we have to know mask to cut host bits in
advance.
(And the real reason is that I'm too lazy to implement hierarchical
matching (check /32, then /31, then /30) like how, for example,


oh well for that we should use cidr:radix

  Research results have never shown a strong superiority of
hierarchical hash tables over good radix implementations,
  and in those cases one usually adopts partial prefix
expansion so you only have, say, masks that are a
  multiple of 2..8 bits so you only need a small number of
hash lookups.

Definitely, especially for IPv6. So I was actually thinking about covering
some special sparse cases (e.g. someone having a bunch of /32 and a bunch
of /30 and that's all).

Btw, since we're talking about good radix implementation: what license
does DXR have? :)
Is it OK to merge it as another cidr implementation?



cidr is a very ugly name, i'd rather use addr

DXR has a bsd license and of course it is possible to use it.
You should ask Marko Zec for his latest version of the code
(and probably make sure we have one copy of the code in the source tree).


Speaking of features, one thing that would be nice is the ability
for tables to reference the in-kernel tables (e.g. fibs, socket
lists, interface lists...), perhaps in readonly mode.
How complex do you think that would be ?


I'm a very happy user of ipfw and I think these are nice improvements 
and will make things more flexible...


I have 2 nits to pick with the current version.

I've found the notation ipnr:something rather frustrating when using 
ipv6 addresses. Sort of like typing a ipv6 address in a browser, the 
last :xx is always interpreted as portnumber, UNLESS you wrap it in []'s.

compare
2001:4cb8:3:1::1
2001:4cb8:3:1::1:80
[2001:4cb8:3:1::1]:80
The first and the last are the same host but a different port, the 
middle one is just a different host.
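
The bracket convention compared above can be sketched as a small parser. This is only an illustration of the notation; `split_host_port` is a hypothetical helper and nothing like it is claimed to exist in ipfw(8).

```python
import ipaddress


def split_host_port(text: str):
    """Return (address, port or None); a port is only recognised after [addr]."""
    if text.startswith("["):
        host, closing, rest = text[1:].partition("]")
        if not closing:
            raise ValueError("missing closing bracket")
        if rest == "":
            port = None
        elif rest.startswith(":"):
            port = int(rest[1:])
        else:
            raise ValueError("unexpected text after closing bracket")
        return ipaddress.ip_address(host), port
    # Without brackets the whole string is an address, so a trailing
    # ":80" is simply one more 16-bit group of an IPv6 address.
    return ipaddress.ip_address(text), None


for s in ("2001:4cb8:3:1::1", "2001:4cb8:3:1::1:80", "[2001:4cb8:3:1::1]:80"):
    print(s, "->", split_host_port(s))
```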


Could/should we do the same in ipfw?
Well, we should, but I'm unsure if we have host:port notation anywhere 
in current (or new) syntax:


And I keep running into the
ipfw add deny all from table(50) to any
notation. The ()'s need to be escaped in almost any shell, whereas when I 
look at the syntax there is little reason to require the ()'s.
The keyword table always needs to be followed by a number (and in the 
new version a (word|number) ).
We need _some_ discriminator to ensure that the next parameter after 
to or from is not a hostname.
We also have some other places where tables are used: via 
interface|table(X), lookup X, flow table(X) [new].
I agree that parentheses might not be the best choice (and something 
like :tablename:, %tablename%, or even table:tablename might look better).
Theoretically, we can support both (old/new) and show rules with the new one 
by default.




Thanx for the nice work,
--WjW




Re: [CFT] new tables for ipfw

2014-08-14 Thread Alexander V. Chernikov
On 14.08.2014 15:52, Alexander V. Chernikov wrote:
 On 14.08.2014 15:15, Luigi Rizzo wrote:



 On Thu, Aug 14, 2014 at 12:57 PM, Alexander V. Chernikov
 melif...@yandex-team.ru wrote:

 On 14.08.2014 14:44, Luigi Rizzo wrote:



 On Thu, Aug 14, 2014 at 11:57 AM, Alexander V. Chernikov
 melif...@yandex-team.ru wrote:

 On 14.08.2014 13:23, Luigi Rizzo wrote:



 On Wed, Aug 13, 2014 at 10:11 PM, Alexander V. Chernikov
 melif...@yandex-team.ru wrote:

 Hello list.

 I've been hacking ipfw for a while and it seems there
 is something ready to test/review in projects/ipfw branch.


 this is a fantastic piece of work, thanks for doing it and for
 integrating the feedback.

 I have some detailed feedback that I will send you privately,
 but just a curiosity:

 ...

 Some examples (see ipfw(8) manual page for the
 description):

 ...


   ipfw table mi_test create type cidr algo cidr:hash masks=/30,/64


 why do we need to specify mask lengths in the above?
 Well, since we're hashing IP we have to know mask to cut
 host bits in advance.
 (And the real reason is that I'm too lazy to implement
 hierarchical matching (check /32, then /31, then /30) like
 how, for example,


 oh well for that we should use cidr:radix

 Research results have never shown a strong superiority of
 hierarchical hash tables over good radix implementations,
 and in those cases one usually adopts partial prefix
 expansion so you only have, say, masks that are a
 multiple of 2..8 bits so you only need a small number of
 hash lookups.
 Definitely, especially for IPv6. So I was actually thinking about
 covering some special sparse cases (e.g. someone having a bunch
 of /32 and a bunch of /30 and that's all).

 Btw, since we're talking about good radix implementation: what
 license does DXR have? :)
 Is it OK to merge it as another cidr implementation?

  
 cidr is a very ugly name, i'd rather use addr
 Ok, no problem with that. addr really sounds better.

 DXR has a bsd license and of course it is possible to use it.
 You should ask Marko Zec for his latest version of the code
 (and probably make sure we have one copy of the code in the source tree).
 Great! I'll ask him :)

 Speaking of features, one thing that would be nice is the ability
 for tables to reference the in-kernel tables (e.g. fibs, socket
 lists, interface lists...), perhaps in readonly mode.
 How complex do you think that would be ?
Well, the biggest problem is that the table handling code assumes that
we know the number of items in advance, and that since we're holding locks it
won't change, so we don't need a large contiguous buffer to dump data to.
This is not the case with external tables, so we can't _reliably_ dump
them (the same situation as in the case of dynamic states).
Anyway, I've added cidr:kfib algo (
http://svnweb.freebsd.org/base?view=revision&revision=270001 ) and it
looks funny.
Quoting commit message:

# ipfw table fib2 create algo cidr:kfib fib=2
# ipfw table fib2 info
+++ table(fib2), set(0) +++
 kindex: 2, type: cidr, locked
 valtype: number, references: 0
 algorithm: cidr:kfib fib=2
 items: 11, size: 288
# ipfw table fib2 list
+++ table(fib2), set(0) +++
10.0.0.0/24 0
127.0.0.1/32 0
::/96 0
::1/128 0
:::0.0.0.0/96 0
2a02:978:2::/112 0
fe80::/10 0
fe80:1::/64 0
fe80:2::/64 0
fe80:3::/64 0
ff02::/16 0
# ipfw table fib2 lookup 10.0.0.5
10.0.0.0/24 0
# ipfw table fib2 lookup 2a02:978:2::11 
2a02:978:2::/112 0
# ipfw table fib2 detail  
+++ table(fib2), set(0) +++
 kindex: 2, type: cidr, locked
 valtype: number, references: 0
 algorithm: cidr:kfib fib=2
 items: 11, size: 288
 IPv4 algorithm radix info
  items: 0 itemsize: 200
 IPv6 algorithm radix info
  items: 0 itemsize: 200

 Implementing algo support for particular provider like sockets/iflists
 shouldn't be hard. Most of the algorithms complexity lies in table
 modifications. Here we have to support
 lookup and dump operations, so it is the question of providing
 necessary bindings to existing mechanisms (via some direct binding or
 utilizing things like kernel_sysctl for dump support).

 It looks like the following maps well to current table concept:
 * such tables are not created by default
 * user issues
  `ipfw table kfib create type addr algo addr:kernel fib=0`
 or
  `ipfw table ktcp create type flow algo flow:kernel_tcp fib=0`
 or
 `ipfw table kiface create type iface algo iface:kernel`
 * tables have special readonly type, flush_all requests are ignored
 * no state stored internally

 So generic table handling code needs to be modified to support
 read-only

[CFT] new tables for ipfw

2014-08-13 Thread Alexander V. Chernikov

Hello list.

(sorry for posting twice, patch seems to be too big to be posted as 
attachment).


I've been hacking ipfw for a while and it seems there is something ready 
to test/review in projects/ipfw branch.


Main user-visible changes are related to tables:

1) Tables are now identified by names, not numbers. There can be up to 
65k tables with up to 63-byte long names (*1).
2) Tables are now set-aware (default off), so you can switch/move them 
atomically with rules.
3) More functionality is supported (swap, lock, limits, user-level 
lookup, batched add/del) by generic table code.
4) New table types are added (flow) so you can match multiple packet 
fields at once.
5) Ability to add different types of lookup algorithms for a particular 
table type has been added.
6) New table algorithms are added (cidr:hash, iface:array, number:array 
and flow:hash) to make certain types of lookup more efficient.


7) No ABI breakage has happened: all functionality supported by old 
ipfw(8) remains functional. Old & new binaries can work together with 
the following restrictions:
* Tables named other than ^\d+$ are shown as table(65535) in ruleset in 
old binaries
* I'm a bit unsure about the "lookup src-port|dst-port N" case, something 
may be broken here. Anyway, this can be fixed for MFC.


Some examples (see ipfw(8) manual page for the description):

  0:02 [2] zfscurr0# ipfw table fl2 create type 
flow:src-ip,proto,dst-port algo flow:hash

   0:02 [2] zfscurr0# ipfw table fl2 info
   +++ table(fl2), set(0) +++
kindex: 0, type: flow:src-ip,proto,dst-port
valtype: number, references: 0
algorithm: flow:hash
items: 0, size: 280
   0:02 [2] zfscurr0# ipfw table fl2 add 2a02:6b8::333,tcp,443 45000
   0:02 [2] zfscurr0# ipfw table fl2 add 10.0.0.92,tcp,80 22000
   0:02 [2] zfscurr0# ipfw table fl2 list
   +++ table(fl2), set(0) +++
   2a02:6b8::333,6,443 45000
   10.0.0.92,6,80 22000
   0:02 [2] zfscurr0# ipfw add 200 count tcp from me to 78.46.89.105 80 
flow 'table(fl2)'


   ipfw table mi_test create type cidr algo cidr:hash masks=/30,/64
   ipfw table mi_test add 10.0.0.8/30
   ipfw table mi_test add 2a02:6b8:b010::1/64 25

   # ipfw table si add 1.1.1.1/32  2.2.2.2/32 
   added: 1.1.1.1/32 
   added: 2.2.2.2/32 
   # ipfw table si add 2.2.2.2/32 2200 4.4.4.4/32 
   exists: 2.2.2.2/32 2200
   added: 4.4.4.4/32 
   ipfw: Adding record failed: record already exists
   ^ Returns error but keeps inserted items
   # ipfw table si list
   +++ table(si), set(0) +++
   1.1.1.1/32 
   2.2.2.2/32 
   4.4.4.4/32 
   # ipfw table si atomic add 3.3.3.3/32  4.4.4.4/32 4400 
5.5.5.5/32 

   added(reverted): 3.3.3.3/32 
   exists: 4.4.4.4/32 4400
   ignored: 5.5.5.5/32 
   ipfw: Adding record failed: record already exists
   ^ Returns error and reverts added records
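
The difference between the plain and the atomic batched add shown above can be modelled in a few lines. This sketch only reproduces the reported output ("added", "exists", "ignored", "added(reverted)"); the function `batch_add` and the dict-based table are assumptions for the example, not the kernel implementation.

```python
def batch_add(table: dict, entries, atomic: bool = False):
    """Add (key, value) pairs; return (success flag, per-entry report)."""
    report, added, failed = [], [], False
    for key, value in entries:
        if failed and atomic:
            report.append(("ignored", key))   # atomic mode stops at the first failure
        elif key in table:
            report.append(("exists", key))
            failed = True
        else:
            table[key] = value
            added.append(key)
            report.append(("added", key))
    if failed and atomic:
        for key in added:                     # roll back everything this batch inserted
            del table[key]
        report = [("added(reverted)" if status == "added" else status, key)
                  for status, key in report]
    return not failed, report


si = {"1.1.1.1/32": 0, "2.2.2.2/32": 0}
print(batch_add(si, [("2.2.2.2/32", 2200), ("4.4.4.4/32", 0)]))
print(batch_add(si, [("3.3.3.3/32", 0), ("4.4.4.4/32", 4400), ("5.5.5.5/32", 0)],
                atomic=True))
print(sorted(si))
```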


IPFW internals have also changed significantly, mostly the 
userland-interaction part.
Changing table ids from numbers to names resulted in format modification for 
most sockopt codes.
Old sopt format was compact, but very hard to extend (no versioning, 
inability to add more opcodes), so
1) All relevant opcodes were converted to TLV-based versioned 
IP_FW3-based codes.
2) The remaining opcodes (except NAT handlers) were also converted, to be 
able to eliminate all older opcodes at once
3) All IP_FW3 handlers use a special API instead of calling sooptcopy* 
directly, to ease adding other communication methods

4) struct ip_fw is now different for kernel and userland
5) tablearg value has been changed to 0 to ease future extensions
6) Batched add/delete has been added to tables code
7) Batched rule addition is coming soon (most of the changes have 
already been done)
8) Interface tracking API has been added (started on demand) to permit 
efficient interface table operations


9) O(1) skipto cache (*2), currently turned on by default (eats 512K). 
This has to be made optional
10) Rule counters were separated from rule itself and made per-cpu. 
However, this part is not finished yet (problems with timestamps/api)

11) Make radix entries fit into 128 bytes
12) Make struct ip_fw more compact so more rules will fit into 64 bytes
13) Make interface tables use array of existing ifindexes for faster match

14) Several steps have been made towards making libipfw:
* most of the new functions were separated into parse/prepare/show and 
actually-do-stuff pieces.
* there are separate functions for parsing a text string into struct 
ip_fw and printing struct ip_fw to a supplied buffer.

15) Probably some more less significant/forgotten features
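
As an illustration of item 9, an O(1) skipto lookup can be built by precomputing, for every possible rule number, the index of the first rule at or after it. The sketch below shows the idea only; the function name, the list-based layout and the 16-bit rule number range are assumptions, and the real cache layout in projects/ipfw is not described in this mail.

```python
import bisect

MAX_RULE = 65535  # assumption: rule numbers fit in 16 bits


def build_skipto_cache(rule_numbers):
    """rule_numbers must be sorted; cache[n] = index of the first rule numbered >= n."""
    cache = [0] * (MAX_RULE + 1)
    idx = 0
    for n in range(MAX_RULE + 1):
        while idx < len(rule_numbers) and rule_numbers[idx] < n:
            idx += 1
        cache[n] = idx
    return cache


rules = [100, 200, 200, 500, 65535]
cache = build_skipto_cache(rules)
# "skipto 150" becomes a single array read instead of a search:
print(cache[150], bisect.bisect_left(rules, 150))
```

The cache has to be rebuilt whenever rules are added or deleted, which is the price paid for the constant-time lookup on the packet path.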

This is not final version: probably more documentation/style is 
required, there are definitely some uncaught bugs, and so on.


However, test/feedback/review is welcome.

All these changes are available in projects/ipfw branch (synced to 
recent -HEAD), but may be easily applied to recent 9/10 (at least kernel 
part).

Branch: svn://svn.freebsd.org/base/projects/ipfw
Web: 

Re: ipfw named objejcts, table values and syntax change

2014-08-04 Thread Alexander V. Chernikov

On 02.08.2014 12:33, Alexander V. Chernikov wrote:

On 02.08.2014 10:33, Luigi Rizzo wrote:



On Fri, Aug 1, 2014 at 11:08 PM, Alexander V. Chernikov
melif...@freebsd.org wrote:

 Hello all.

 I'm currently working on enhancing ipfw in some areas.
 The most notable (and user-visible) change is named table support.
 The other one is support for different lookup algorithms for different
 key types.

 For example, new ipfw permits writing this:

 ipfw table tb1 create type cidr
 ipfw add allow ip from table(tb1) to any
 ipfw add allow ip from any lookup dst-ip tb1

 ipfw table if1 create type iface
 ipfw add skipto tablearg ip from any to any via table(if1)

 or even this:
 ipfw table fl1 create type flow:src-ip,proto,dst-ip,dst-port
 ipfw table fl1 add 10.0.0.5,tcp,10.0.0.6,80 
 ipfw add allow ip from any to any flow table(fl1)

 all these changes fully preserve backward compatibility.
 (actually tables needs now to be created before use and their type needs
 to match with opcode used, but new ipfw(8) performs auto-creation
 for cidr tables).

 There is another thing I'm going to change and I'm not sure I can keep
 the same compatibility level.

 Table values, from one point of view, can be classified to the following
 types:

 - skipto argument
 - fwd argument (*)
 - link to another object (nat, pipe, queue)
 - plain u32 (not bound to any object)
 (divert/tee,netgraph,tag/utag,limit)

 There are the following reasons why I think it is necessary to implement
 explicit table values typing (like tables):
 - Implementing fwd tablearg for IPv6 hosts requires indirection table
 - Converting nat/pipe instance ids to names renders values unusable
 - retiring old hack with storing saved pointer of found object/rule
 inside rule w/o proper locking
 - making faster skipto


i don't buy the idea that you need typed arguments
for all the cases above. Maybe the case that
may make sense is the fwd argument (and in the future
something else).
We already discussed, i think, the fact that now it
is legal to have references to non existing things
(skipto, pipes etc.) implemented as u32.
Removing that would break configurations.

It depends on actual implementation. This can be preserved by
auto-creating necessary objects in kernel and/or in userspace, so
we can (and should) avoid breaking in this particular way.

Can you please explain your vision on values another time?
As far as I understand, you're not against it in general, but the 
details matter:
* IP address can be one of the types (it won't break much, and we can 
simply skip that one for MFC)
* what about typing for nat/pipes ? we're not going to convert their ids 
to names? (or maybe you can suggest other non-disruptive way?)

* everything else is type u32


Efficiency is not affected, even for skipto,

It depends on workload. While binary search is fast in terms of cpu, it
may not be so fast in terms of memory (since each of the rules is
allocated by a separate malloc() (and that is another thing which is worth
discussing)).


and while i agree that unprotected writes to the pointers
in rules should not happen, these pointers are changed
infrequently so a global read-mostly lock should be
sufficient to protect all changes to the rules.

cheers
luigi


 So, as the result, table will have lookup key type (already done),
 value type ('skipto', 'nexthop', 'nat', 'pipe', 'number', ..) and some
 additional restrictions (like inability to add non-existing nat instance
 id).
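
 A rough model of such a typed table, with the restriction on non-existing nat instances, might look like the sketch below. All names here (`TypedValueTable`, `existing_ids`, the set of object-bound types) are hypothetical and only illustrate the proposal.

```python
class TypedValueTable:
    """Hypothetical table with a declared key type and a single value type."""

    OBJECT_TYPES = {"nat", "pipe"}  # value types that must refer to an existing object

    def __init__(self, name, key_type, value_type, existing_ids=()):
        self.name = name
        self.key_type = key_type
        self.value_type = value_type
        self._existing_ids = set(existing_ids)  # ids of already configured objects
        self.entries = {}

    def add(self, key, value):
        # Object-bound values are checked at insertion time; plain numbers are not.
        if self.value_type in self.OBJECT_TYPES and value not in self._existing_ids:
            raise ValueError(f"{self.value_type} instance {value} does not exist")
        self.entries[key] = value


nat_table = TypedValueTable("1", "cidr", "nat", existing_ids={110})
nat_table.add("10.0.10.0/24", 110)      # accepted: nat 110 exists
try:
    nat_table.add("10.0.20.0/24", 120)  # rejected: nat 120 was never configured
except ValueError as err:
    print(err)
```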

 This change will break (at least) scenarios where people are
 using one table for both nat/pipe instances (and keep nat ids in sync
 with pipe ones). For example:

 ipfw table 1 add 10.0.10.0/24 110
 ipfw table 1 add 10.0.20.0/24 120

 ipfw add 100 nat tablearg from table(1) to any via vlanX in
 ..
 ipfw add 500 pipe tablearg from table(1) to any via ix0 out

 It looks like it is not so easy to bind values for given table to
 different objects (or different tasks) (and lack of compatibility kills
 hope for MFC).

 Ideas?










--
  Prof. Luigi RIZZO, ri...@iet.unipi.it  . Dip. di Ing. dell'Informazione
  http://www.iet.unipi.it/~luigi/        . Universita` di Pisa
  TEL      +39-050-2211611               . via Diotisalvi 2
  Mobile   +39-338-6809875

Re: ipfw named objejcts, table values and syntax change

2014-08-04 Thread Alexander V. Chernikov
On 04.08.2014 15:58, Luigi Rizzo wrote:
 On Mon, Aug 04, 2014 at 01:44:26PM +0400, Alexander V. Chernikov wrote:
 On 02.08.2014 12:33, Alexander V. Chernikov wrote:
 On 02.08.2014 10:33, Luigi Rizzo wrote:


 On Fri, Aug 1, 2014 at 11:08 PM, Alexander V. Chernikov
 melif...@freebsd.org wrote:

  Hello all.

  I'm currently working on enhancing ipfw in some areas.
  The most notable (and user-visible) change is named table support.
  The other one is support for different lookup algorithms for different
  key types.

  For example, new ipfw permits writing this:

  ipfw table tb1 create type cidr
  ipfw add allow ip from table(tb1) to any
  ipfw add allow ip from any lookup dst-ip tb1

  ipfw table if1 create type iface
  ipfw add skipto tablearg ip from any to any via table(if1)

  or even this:
  ipfw table fl1 create type flow:src-ip,proto,dst-ip,dst-port
  ipfw table fl1 add 10.0.0.5,tcp,10.0.0.6,80 
  ipfw add allow ip from any to any flow table(fl1)

  all these changes fully preserve backward compatibility.
  (actually tables needs now to be created before use and their type 
 needs
  to match with opcode used, but new ipfw(8) performs auto-creation
  for cidr tables).

  There is another thing I'm going to change and I'm not sure I can keep
  the same compatibility level.

  Table values, from one point of view, can be classified to the 
 following
  types:

  - skipto argument
  - fwd argument (*)
  - link to another object (nat, pipe, queue)
  - plain u32 (not bound to any object)
  (divert/tee,netgraph,tag/utag,limit)

  There are the following reasons why I think it is necessary to 
 implement
  explicit table values typing (like tables):
  - Implementing fwd tablearg for IPv6 hosts requires indirection table
  - Converting nat/pipe instance ids to names renders values unusable
  - retiring old hack with storing saved pointer of found object/rule
  inside rule w/o proper locking
  - making faster skipto


 I don't buy the idea that you need typed arguments
 for all the cases above. Maybe the case that
 may make sense is the fwd argument (and in the future
 something else).
 We already discussed, i think, the fact that now it
 is legal to have references to non existing things
 (skipto, pipes etc.) implemented as u32.
 Removing that would break configurations.
 It depends on actual implementation. This can be preserved by
 auto-creating necessary objects in kernel and/or in userspace, so
 we can (and should) avoid breaking in this particular way.
 Can you please explain your vision on values another time?
 As far as I understand, you're not against it in general, but the 
 details matter:
 * IP address can be one of the types (it won't break much, and we can 
 simply skip that one for MFC)
 * what about typing for nat/pipes ? we're not going to convert their ids 
 to names? (or maybe you can suggest other non-disruptive way?)
 * everything else is type u32
 
 Correct, I am mostly concerned about the details, not on the general concept.
 
 To summarize the discussion Alexander and I had about converting
 identifiers from numbers to arbitrary strings (this is partly related
 to the values stored in tables, but I think we should have a coherent
 behaviour)
 
 1. CURRENTLY ipfw uses numeric identifiers in a small range (16 bits or less)
for rules, pipes, queues, tables, probably nat instances.
 
 2. CURRENTLY, in all the above contexts, it is legal to reference a
non existing object (rule, pipe, table names, etc.),
and the kernel will do something reasonable, namely jump to the
next rule, drop traffic for non existing pipes, and so on.
 
 3. of course we want to preserve backward compatibility both for
the ioctl interface, and for user configurations.
 
 4. The in-kernel representation of identifiers is not visible to users,
so we can use a numeric representation in the kernel for identifiers.
Strings like 12345 are converted with atoi() or the like,
whereas for other identifiers or numbers outside of the 2^16 range
the kernel manages a translation table, allocating new numeric
identifiers if a new string appears.
This permits backward compatibility for old rulesets, and does not
impact performance because the translation table is only
used during rules additions or deletion.
Yes. However, this requires holding either (1) two pointers (old/new
arrays), or (2) a 65k+ entry index array, or (3) a chained hash table.
(1) would require additional pointers for each subsystem (and some
additional management),
(2) will definitely upset the embedded guys, and
(3) is worse in terms of performance.
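
The translation scheme from point 4 can be sketched in userspace C as below. This is only an illustration under stated assumptions: `name_map`, `name_to_kidx`, `NAMED_SLOTS` and the linear search are hypothetical and are not the ipfw implementation. Decimal strings inside the 16-bit range map to themselves, anything else gets the next free index above 65535.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define LEGACY_MAX   65535u      /* ids below 2^16 are used verbatim */
#define NAMED_SLOTS  64          /* hypothetical capacity of the map */
#define KIDX_INVALID UINT32_MAX  /* returned when no index is available */

struct name_map {
	char     names[NAMED_SLOTS][32];
	uint32_t used;
};

/* Translate a user-visible identifier into a numeric in-kernel index. */
static uint32_t
name_to_kidx(struct name_map *m, const char *name)
{
	char *end;
	unsigned long v;
	uint32_t i;

	assert(m != NULL && name != NULL);
	if (name[0] == '\0' || strlen(name) >= sizeof(m->names[0]))
		return (KIDX_INVALID);

	/* Old-style numeric identifiers keep their value. */
	errno = 0;
	v = strtoul(name, &end, 10);
	if (name[0] >= '0' && name[0] <= '9' && *end == '\0' &&
	    errno == 0 && v <= LEGACY_MAX)
		return ((uint32_t)v);

	/* Everything else: look up, or allocate a new index. */
	for (i = 0; i < m->used; i++)
		if (strcmp(m->names[i], name) == 0)
			return (LEGACY_MAX + 1 + i);
	if (m->used == NAMED_SLOTS)
		return (KIDX_INVALID);
	strcpy(m->names[m->used], name);
	return (LEGACY_MAX + 1 + m->used++);
}
```

The map is consulted only when a rule or table is added or removed, which is the property the discussion relies on: the packet path sees plain integers.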
 
 With this in mind, i think we should follow a similar approach for
 objects stored in tables, hence
 
   if an u32 value was available in the past, it must be
   available also in the new

Re: ipfw named objects, table values and syntax change

2014-08-02 Thread Alexander V. Chernikov
On 02.08.2014 10:33, Luigi Rizzo wrote:
 
 
 
 On Fri, Aug 1, 2014 at 11:08 PM, Alexander V. Chernikov
 melif...@freebsd.org wrote:
 
 Hello all.
 
 I'm currently working on to enhance ipfw in some areas.
 The most notable (and user-visible) change is named table support.
 The other one is support for different lookup algorithms for different
 key types.
 
 For example, new ipfw permits writing this:
 
 ipfw table tb1 create type cidr
 ipfw add allow ip from table(tl1) to any
 ipfw add allow ip from any lookup dst-ip tb1
 
 ipfw table if1 create type iface
 ipfw add skipto tablearg ip from any to any via table(if1)
 
 or even this:
 ipfw table fl1 create type flow:src-ip,proto,dst-ip,dst-port
 ipfw table fl1 add 10.0.0.5,tcp,10.0.0.6,80 
 ipfw add allow ip from any to any flow table(fl1)
 
 all these changes fully preserve backward compatibility.
 (actually tables needs now to be created before use and their type needs
 to match with opcode used, but new ipfw(8) performs auto-creation
 for cidr tables).
 
 There is another thing I'm going to change and I'm not sure I can keep
 the same compatibility level.
 
 Table values, from one point of view, can be classified to the following
 types:
 
 - skipto argument
 - fwd argument (*)
 - link to another object (nat, pipe, queue)
 - plain u32 (not bound to any object)
 (divert/tee,netgraph,tag/utag,limit)
 
 There are the following reasons why I think it is necessary to implement
 explicit table values typing (like tables):
 - Implementing fwd tablearg for IPv6 hosts requires indirection table
 - Converting nat/pipe instance ids to names renders values unusable
 - retiring old hack with storing saved pointer of found object/rule
 inside rule w/o proper locking
 - making faster skipto
 
 
 I don't buy the idea that you need typed arguments
 for all the cases above. Maybe the case that
 may make sense is the fwd argument (and in the future
 something else).
 We already discussed, i think, the fact that now it
 is legal to have references to non existing things
 (skipto, pipes etc.) implemented as u32.
 Removing that would break configurations.
It depends on actual implementation. This can be preserved by
auto-creating necessary objects in kernel and/or in userspace, so
we can (and should) avoid breaking in this particular way.
 
 Efficiency is not affected, even for skipto,
It depends on the workload. While binary search is fast in terms of CPU, it
may not be so fast in terms of memory (since each rule is
allocated by a separate malloc() (and that is another thing which is worth
discussing)).

 and while i agree that unprotected writes to the pointers
 in rules should not happen, these pointers are changed
 infrequently so a global read-mostly lock should be
 sufficient to protect all changes to the rules.
 
 cheers
 luigi
 
 
 So, as the result, table will have lookup key type (already done),
 value type ('skipto', 'nexthop', 'nat', 'pipe', 'number', ..) and some
 additional restrictions (like inability to add non-existing nat instance
 id).
 
 This change will break (at least) scenarios where people are
 using one table for both nat/pipe instances (and keep nat ids in sync
 with pipe ones). For example:
 
 ipfw table 1 add 10.0.10.0/24 110
 ipfw table 1 add 10.0.20.0/24 120
 
 ipfw add 100 nat tablearg from table(1) to any via vlanX in
 ..
 ipfw add 500 pipe tablearg from table(1) to any via ix0 out
 
 It looks like it is not so easy to bind values for given table to
 different objects (or different tasks) (and lack of compatibility kills
 hope for MFC).
 
 Ideas?
 
 
 
 
 
 
 
 
 
 
 -- 
 -+---
  Prof. Luigi RIZZO, ri...@iet.unipi.it  . Dip. di Ing. dell'Informazione
  http://www.iet.unipi.it/~luigi/        . Universita` di Pisa
  TEL      +39-050-2211611               . via Diotisalvi 2
  Mobile   +39-338-6809875               . 56122 PISA (Italy)
 -+---


ipfw named objects, table values and syntax change

2014-08-01 Thread Alexander V. Chernikov
Hello all.

I'm currently working on enhancing ipfw in some areas.
The most notable (and user-visible) change is named table support.
The other one is support for different lookup algorithms for different
key types.

For example, new ipfw permits writing this:

ipfw table tb1 create type cidr
ipfw add allow ip from table(tb1) to any
ipfw add allow ip from any lookup dst-ip tb1

ipfw table if1 create type iface
ipfw add skipto tablearg ip from any to any via table(if1)

or even this:
ipfw table fl1 create type flow:src-ip,proto,dst-ip,dst-port
ipfw table fl1 add 10.0.0.5,tcp,10.0.0.6,80 
ipfw add allow ip from any to any flow table(fl1)

All these changes fully preserve backward compatibility.
(Actually, tables now need to be created before use and their type needs
to match the opcode used, but the new ipfw(8) performs auto-creation
for cidr tables.)

There is another thing I'm going to change and I'm not sure I can keep
the same compatibility level.

Table values, from one point of view, can be classified into the following
types:

- skipto argument
- fwd argument (*)
- link to another object (nat, pipe, queue)
- plain u32 (not bound to any object) (divert/tee,netgraph,tag/utag,limit)

There are the following reasons why I think it is necessary to implement
explicit table values typing (like tables):
- Implementing fwd tablearg for IPv6 hosts requires indirection table
- Converting nat/pipe instance ids to names renders values unusable
- retiring old hack with storing saved pointer of found object/rule
inside rule w/o proper locking
- making faster skipto

So, as a result, a table will have a lookup key type (already done),
a value type ('skipto', 'nexthop', 'nat', 'pipe', 'number', ..) and some
additional restrictions (like the inability to add a non-existing nat instance
id).
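
A minimal sketch of what such a typed-value check could look like, assuming hypothetical names throughout (`tvalue_type`, `typed_table`, `table_value_ok` and the flat id array are illustrative, not the proposed kernel structures): plain numeric values are always accepted, while values in 'nat' or 'pipe' tables must refer to an existing instance.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical value types mirroring the list above. */
enum tvalue_type { TV_NUMBER, TV_SKIPTO, TV_NEXTHOP, TV_NAT, TV_PIPE };

struct typed_table {
	enum tvalue_type vtype;     /* what the stored values mean */
	const uint32_t  *live_ids;  /* ids of existing nat/pipe instances */
	size_t           n_live;
};

/* Return 0 if value may be stored in the table, ENOENT otherwise. */
static int
table_value_ok(const struct typed_table *t, uint32_t value)
{
	size_t i;

	assert(t != NULL);
	/* Only object-bound types are restricted to existing instances. */
	if (t->vtype != TV_NAT && t->vtype != TV_PIPE)
		return (0);
	for (i = 0; i < t->n_live; i++)
		if (t->live_ids[i] == value)
			return (0);
	return (ENOENT);
}
```

With a check like this, a table holding both nat and pipe ids no longer fits a single value type, which is exactly the compatibility problem described next.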

This change will break (at least) scenarios where people are
using one table for both nat/pipe instances (and keep nat ids in sync
with pipe ones). For example:

ipfw table 1 add 10.0.10.0/24 110
ipfw table 1 add 10.0.20.0/24 120

ipfw add 100 nat tablearg from table(1) to any via vlanX in
..
ipfw add 500 pipe tablearg from table(1) to any via ix0 out

It looks like it is not so easy to bind values for given table to
different objects (or different tasks) (and lack of compatibility kills
hope for MFC).

Ideas?








Re: kern/122963: [ipfw] tcpdump does not show packets redirected by 'ipfw fwd' on proper interface

2014-01-16 Thread Alexander V. Chernikov
The following reply was made to PR kern/122963; it has been noted by GNATS.

From: Alexander V. Chernikov melif...@freebsd.org
To: bug-follo...@freebsd.org, zub...@advancedhosters.com
Cc:  
Subject: Re: kern/122963: [ipfw] tcpdump does not show packets redirected
 by 'ipfw fwd' on proper interface
Date: Thu, 16 Jan 2014 15:09:46 +0400

 This is not a bug.
 
 You're adding a fwd rule which forwards the outgoing packet back to the local 
 system (since the fwd address is the em0 address).
 That's why you're not seeing the packet on the wire.


Re: ipfw table add problem

2013-11-24 Thread Alexander V. Chernikov
On 24.11.2013 19:43, Özkan KIRIK wrote:
 Hi,
 
 I tested patch. This patch solves, ipfw table 1 add 4899
Ok. So I'll commit this fix soon.
 
 But, ipfw table 1 add 10.2.3.01 works incorrectly.
 output is below.
 # ./ipfw table 1 flush
 # ./ipfw table 1 add 10.2.3.01
inet_pton() does not recognize this as a valid IPv4 address, so it is
treated as an unsigned integer key. It looks like this behavior is mentioned
in the STANDARDS section.
 # ./ipfw table 1 list
 0.0.0.10/32 0
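
The fallback described above can be reproduced with a small userspace sketch. It assumes a simplified parser (`table_key_v4` is a hypothetical helper, not the ipfw(8) code): when inet_pton() rejects the argument, the leading decimal digits are taken as a numeric key and stored in network byte order, which is why "10.2.3.01" ends up listed as 0.0.0.10/32.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Fill *key for a table argument.  Returns 1 if arg was parsed as a
 * dotted-quad IPv4 address, 0 if it fell back to a numeric key.
 */
static int
table_key_v4(const char *arg, struct in_addr *key)
{
	unsigned long num;

	assert(arg != NULL && key != NULL);
	if (inet_pton(AF_INET, arg, key) == 1)
		return (1);

	/* Not a valid address: strtoul() stops at the first non-digit. */
	num = strtoul(arg, NULL, 10);
	key->s_addr = htonl((uint32_t)num);
	return (0);
}
```

A stricter parser would check that strtoul() consumed the whole argument and reject "10.2.3.01" outright instead of silently truncating it.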
 
 
 
 
 On Sat, Nov 23, 2013 at 11:09 PM, Alexander V. Chernikov
 melif...@ipfw.ru wrote:
 
 On 19.11.2013 23:55, Özkan KIRIK wrote:
 Hi,

 I'm using kernel FreeBSD 10.0-BETA3 #2 r257635 kernel. I am trying
 to add port number to ipfw tables. But there is something strange
 : Problem is easily repeatable.

 # ipfw table 1 flush
 # ipfw table 1 add 4899
 # ipfw table 1 list
 ::/0 0

 # ipfw table 1 flush
 # ipfw table 1 add 10.2.3.01   (not 10.0.0.1; the last octet has a leading 0)
 # ipfw table 1 list
 ::/0 0

 # ipfw table 1 delete ::/0
 ipfw: setsockopt(IP_FW_TABLE_XDEL): No such process


 I guess that, this problem is related to radix mask calculation
 problem/fix.
 Hello.
 I'm sorry, it seems that key lookups were broken for quite a long time.
 
 Can you apply attached patch, rebuild ipfw(8) binary and see if this
 helps?
 
 

 Is there a quick solution for this?
 Best regards,

 

 






Re: ipfw table add problem

2013-11-23 Thread Alexander V. Chernikov
On 19.11.2013 23:55, Özkan KIRIK wrote:
 Hi,
 
 I'm using kernel FreeBSD 10.0-BETA3 #2 r257635 kernel. I am trying
 to add port number to ipfw tables. But there is something strange
 : Problem is easily repeatable.
 
 # ipfw table 1 flush
 # ipfw table 1 add 4899
 # ipfw table 1 list
 ::/0 0
 
 # ipfw table 1 flush
 # ipfw table 1 add 10.2.3.01   (not 10.0.0.1; the last octet has a leading 0)
 # ipfw table 1 list
 ::/0 0
 
 # ipfw table 1 delete ::/0
 ipfw: setsockopt(IP_FW_TABLE_XDEL): No such process
 
 
 I guess that, this problem is related to radix mask calculation
 problem/fix.
Hello.
I'm sorry, it seems that key lookups were broken for quite a long time.

Can you apply attached patch, rebuild ipfw(8) binary and see if this
helps?


 
 Is there a quick solution for this?
 Best regards,
 

Index: sbin/ipfw/ipfw2.c
===
--- sbin/ipfw/ipfw2.c	(revision 258494)
+++ sbin/ipfw/ipfw2.c	(working copy)
@@ -4281,6 +4281,7 @@ table_fill_xentry(char *arg, ipfw_table_xentry *xe
 *pkey = htonl(key);
 type = IPFW_TABLE_CIDR;
 addrlen = sizeof(uint32_t);
+masklen = 32;
 			}
 		}
 	}

Re: [patch] ipfw interface tracking and opcode rewriting

2013-04-24 Thread Alexander V. Chernikov

On 24.04.2013 23:09, Luigi Rizzo wrote:

On Wed, Apr 24, 2013 at 08:46:01PM +0400, Alexander V. Chernikov wrote:

On 24.04.2013 20:23, Luigi Rizzo wrote:

...

version) in the middle of the next week.

hmmm this is quite a large change, and from the description it
is a bit unclear to me how the opcode rewriting thing relates to
the use of strings vs index for name matching.

Sorry, I haven't described this explicitly.
Index matching is done by storing the interface index in the p.glob field of
the ipfw_insn_if instruction.

understood. the reason why i did not use the index is that
one could specify a non-existing interface name, and also interfaces
can be renamed. If you want to use indexes, you should add
(perhaps you do, i haven't checked)

Yes, this is done (without 'good' renaming handling), but still.

hooks to the interface add/rename/delete code in order to
update the ruleset upon changes on the if list, and it
seemed to me a bad idea to add this dependency
(lockingwise, too).

Really, with 16-byte fixed size interface names, the match
is as simple as this:

 #if CAN_DO_FAST_MATCH && IFNAMSIZ == 16 /* archs with no align requirements */
{
uint64_t *a = (uint64_t *)ifp->if_xname;
uint64_t *b = (uint64_t *)cmd->name;
if (a[0] == b[0] && a[1] == b[1])
return 1;
}
 #else
if (strncmp(ifp->if_xname, cmd->name, IFNAMSIZ) == 0)
return 1;
 #endif

(assuming the names are zero-padded, which should be the case already).
Since you have the measurement infrastructure in place, perhaps
you have an easy way to try this patch and see how effective
it is in terms of performance.
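
For anyone who wants to try the idea outside the kernel, here is a self-contained sketch of the same two-word comparison. The names `IFNAME_LEN`, `ifname_pad` and `ifname_equal` are made up for this example; memcpy() is used instead of pointer casts so it is also safe on architectures with alignment requirements, and compilers normally turn it into two plain 64-bit loads.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define IFNAME_LEN 16	/* mirrors IFNAMSIZ == 16 from the snippet above */

/* Copy src into a fixed-size, zero-padded name buffer. */
static void
ifname_pad(char dst[IFNAME_LEN], const char *src)
{
	size_t len = strlen(src);

	assert(len < IFNAME_LEN);
	memset(dst, 0, IFNAME_LEN);
	memcpy(dst, src, len);
}

/* Compare two zero-padded 16-byte names as two 64-bit words. */
static int
ifname_equal(const char a[IFNAME_LEN], const char b[IFNAME_LEN])
{
	uint64_t wa[2], wb[2];

	memcpy(wa, a, sizeof(wa));
	memcpy(wb, b, sizeof(wb));
	return (wa[0] == wb[0] && wa[1] == wb[1]);
}
```

The comparison is only correct if both buffers are zero-padded past the terminating NUL, which is the assumption stated above.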

I'll try this tomorrow, thanks.



Additionally, i wonder if there isn't a better way to replace strncmp
with some two 64-bit comparisons (the name is 16 bytes) by making
sure that the fields are zero-padded and suitably aligned.
At this point, on the machines you care about (able to sustain
1+ Mpps) the two comparison should have the same cost as
the index comparison, without the need to track update in the names.

Well, actually I'm thinking of the next 2 steps:
1) making kernel rule header more compact (20 bytes instead of 48) and
making it invisible for userland.
This involves rule counters to be stored separately (and possibly as
pcpu-based ones).
2) since ruleset is now nearly readonly and more or less compact we can
try to store it in
contiguous address space to optimize cache line usage.

certainly a worthwhile goal (also using gleb's new counters)
but i suspect that compacting rules is a second order effect.
I am a bit skeptical it makes a big difference on the in-kernel
version of ipfw. You might see some difference in the
My current numbers are ~5mpps of IPv4 forwarding with ipfw turned on (1 
rule) for vlans over ixgbe, with 60% cpu usage (2xE5646).

For lagg with 2x ixgbe it is ~7mpps with the same 60% usage.
(And, say, 70% of CPU usage on our production is ipfw, despite low 
number of rules).

userspace version, which runs on top of netmap.
We are preparing to move forward in this direction (and thinking of 
20-30mpps as our goal).
(And I hope some changes of kernel-based version can migrate to userland 
one :))


cheers
luigi





Re: IPFW tables trouble

2013-03-22 Thread Alexander V. Chernikov

On 20.03.2013 01:49, naPtu 3ah wrote:

  problem is still here

router5:/etc@[23:05] # ipfw show 12000-12200
12101  96   7236 count ip from any to 91.222.49.77 out via em0
12102   116147632355 allow ip from any to table(11) out via em0
12140   0  0 count ip from any to 91.222.49.77 out via em0

router5:/etc@[23:05] # ipfw table 11 list
91.222.49.26/32 0
router5:/etc@[23:06] # ipfw table 11 flush
router5:/etc@[23:06] # ipfw table 11 flush
router5:/etc@[23:06] # ipfw table 11 list
91.222.49.26/32 0
router5:/etc@[23:06] # ipfw table 11 delete 91.222.49.26/32
ipfw: setsockopt(IP_FW_TABLE_DEL): No such process
router5:/etc@[23:06] # ipfw table 11 list
91.222.49.26/32 0
router5:/etc@[23:06] # ipfw table 11 flush
router5:/etc@[23:07] # ipfw table 11 list
91.222.49.26/32 0
router5:/etc@[23:07] # uname -a
FreeBSD router5 8.3-RELEASE-p5 FreeBSD 8.3-RELEASE-p5 #3: Tue Feb  5 06:55:47 
EET 2013 r...@icenet.net.ua:/usr/obj/usr/src/sys/ICENET3  i386
Can you please update to recent -STABLE (or at least apply attached 
simple patch) and see if the problem remains?
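
The patch below only rejects prefix lengths above 32 for these IPv4 tables. As a rough illustration of why such a bound matters (this is a standalone sketch, and `ipv4_mask_from_len` is a hypothetical helper, not a kernel function), computing an IPv4 netmask from an unchecked length would shift by a negative amount:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Compute a host-order IPv4 netmask from a prefix length.
 * Returns 0 on success, EINVAL if the length cannot apply to IPv4.
 */
static int
ipv4_mask_from_len(unsigned mlen, uint32_t *mask)
{
	assert(mask != NULL);
	if (mlen > 32)
		return (EINVAL);
	/* Shifting a 32-bit value by 32 is undefined, so handle /0 apart. */
	*mask = (mlen == 0) ? 0 : ~(uint32_t)0 << (32 - mlen);
	return (0);
}
```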





Index: ip_fw_table.c
===
--- ip_fw_table.c   (revision 232438)
+++ ip_fw_table.c   (working copy)
@@ -96,7 +96,7 @@ ipfw_add_table_entry(struct ip_fw_chain *ch, uint1
struct table_entry *ent;
struct radix_node *rn;
 
-   if (tbl >= IPFW_TABLES_MAX)
+   if ((tbl >= IPFW_TABLES_MAX) || (mlen > 32))
return (EINVAL);
rnh = ch->tables[tbl];
ent = malloc(sizeof(*ent), M_IPFW_TBL, M_NOWAIT | M_ZERO);

[patch] setting/matching DSCP with ipfw

2013-03-14 Thread Alexander V. Chernikov
Hello list!

This is the obvious thing which should have been done at least 5 years ago.
There are several PRs like kern/102471 and kern/121122 with similar
functionality.

The given patch adds support for setting DSCP (O_SETDSCP), which works for both
IPv4 and IPv6 packets. Fast checksum recalculation (RFC 1624) is done in
the former case. DSCP can be specified by name (AFXY, CSX, BE, EF), by value
(0..63) or via tablearg.
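
For reference, the incremental update can be sketched in userspace as follows. This is not the arithmetic used in the patch; it is a plain rendering of RFC 1624 equation 3, HC' = ~(~HC + ~m + m'), applied to the 16-bit word holding version/IHL and TOS, operating on raw header bytes. The helper names are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Full Internet checksum over len bytes (len must be even). */
static uint16_t
cksum_full(const uint8_t *hdr, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	assert(len % 2 == 0);
	for (i = 0; i < len; i += 2)
		sum += (uint32_t)(hdr[i] << 8 | hdr[i + 1]);
	while (sum >> 16)
		sum = (sum & 0xFFFF) + (sum >> 16);
	return ((uint16_t)~sum);
}

/* RFC 1624 eq. 3: new checksum after one 16-bit word changes. */
static uint16_t
cksum_update16(uint16_t hc, uint16_t old_word, uint16_t new_word)
{
	uint32_t sum;

	sum = (uint16_t)~hc;
	sum += (uint16_t)~old_word;
	sum += new_word;
	while (sum >> 16)
		sum = (sum & 0xFFFF) + (sum >> 16);
	return ((uint16_t)~sum);
}

/* Rewrite the DSCP bits of a raw IPv4 header, keeping the ECN bits. */
static void
set_dscp_v4(uint8_t *hdr, uint8_t dscp)
{
	uint16_t old_word, new_word, hc;

	old_word = (uint16_t)(hdr[0] << 8 | hdr[1]);
	hdr[1] = (uint8_t)(((dscp & 0x3F) << 2) | (hdr[1] & 0x03));
	new_word = (uint16_t)(hdr[0] << 8 | hdr[1]);

	hc = (uint16_t)(hdr[10] << 8 | hdr[11]);
	hc = cksum_update16(hc, old_word, new_word);
	hdr[10] = (uint8_t)(hc >> 8);
	hdr[11] = (uint8_t)(hc & 0xFF);
}
```

A header with a valid checksum still verifies after set_dscp_v4(), without re-summing all twenty bytes.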

Matching DSCP is done via another opcode (O_DSCP) which accepts several
classes at once (af11,af22,be). Classes are stored in bitmask (2 u32 words).
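
The two-word bitmask can be sketched like this (helper names are hypothetical; the word order follows the "low_u32 high_u32" layout mentioned in the patch comment). Code points 0..31 live in the first word and 32..63 in the second.

```c
#include <assert.h>
#include <stdint.h>

/* Mark one DSCP code point (0..63) as wanted in a 2 x u32 bitmask. */
static void
dscp_mask_add(uint32_t mask[2], unsigned dscp)
{
	assert(dscp < 64);
	mask[dscp / 32] |= (uint32_t)1 << (dscp % 32);
}

/* Return non-zero if the code point is present in the bitmask. */
static int
dscp_mask_match(const uint32_t mask[2], unsigned dscp)
{
	assert(dscp < 64);
	return (((mask[dscp / 32] >> (dscp % 32)) & 1) != 0);
}

/* DSCP is the upper six bits of the IPv4 TOS byte. */
static unsigned
dscp_from_tos(uint8_t tos)
{
	return (tos >> 2);
}
```

For example, a rule written as `dscp af11,cs3,ef` would set bits 10 and 24 in the low word and bit 14 (46 - 32) in the high word.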

(Btw, current O_TOS can be modified to transparently match single DSCP
point, probably later on..)

Example:
00050  675  37800 setdscp ef ip from any to 2a02:978:11::/64 dscp be
ipfw add 100 count ip from any to any dscp af11,cs3
00100 count ip from any to any dscp af11,cs3

I'm planning to commit updated patch (docs, some style(9)) on Mon 18 if
there are no objections.
Index: sys/netpfil/ipfw/ip_fw2.c
===
--- sys/netpfil/ipfw/ip_fw2.c   (revision 248114)
+++ sys/netpfil/ipfw/ip_fw2.c   (working copy)
@@ -1624,6 +1624,32 @@ do {  \
flags_match(cmd, ip->ip_tos));
break;
 
+   case O_DSCP:
+   {
+   uint32_t *p;
+   uint16_t x;
+
+   p = ((ipfw_insn_u32 *)cmd)->d;
+
+   if (is_ipv4)
+   x = ip->ip_tos >> 2;
+   else if (is_ipv6) {
+   uint8_t *v;
+   v = &((struct ip6_hdr *)ip)->ip6_vfc;
+   x = (*v & 0x0F) << 2;
+   v++;
+   x |= *v >> 6;
+   } else
+   break;
+
+   /* DSCP bitmask is stored as low_u32 high_u32 */
+   if (x >= 32)
+   match = *(p + 1) & (1 << (x - 32));
+   else
+   match = *p & (1 << x);
+   }
+   break;
+
case O_TCPDATALEN:
if (proto == IPPROTO_TCP && offset == 0) {
struct tcphdr *tcp;
@@ -2353,6 +2379,32 @@ do {  \
break;
}
 
+   case O_SETDSCP: {
+   uint16_t code;
+
+   code = IP_FW_ARG_TABLEARG(cmd->arg1) & 0x3F;
+   l = 0;  /* exit inner loop */
+   if (is_ipv4) {
+   uint16_t a;
+
+   a = ip->ip_tos;
+   ip->ip_tos = (code << 2) | (ip->ip_tos & 0x03);
+   a += ntohs(ip->ip_sum) - ip->ip_tos;
+   ip->ip_sum = htons(a);
+   } else if (is_ipv6) {
+   uint8_t *v;
+
+   v = &((struct ip6_hdr *)ip)->ip6_vfc;
+   *v = (*v & 0xF0) | (code >> 2);
+   v++;
+   *v = (*v & 0x3F) | ((code & 0x03) << 6);
+   } else
+   break;
+
+   IPFW_INC_RULE_COUNTER(f, pktlen);
+   break;
+   }
+
case O_NAT:
if (!IPFW_NAT_LOADED) {
retval = IP_FW_DENY;
Index: sys/netpfil/ipfw/ip_fw_log.c
===
--- sys/netpfil/ipfw/ip_fw_log.c(revision 248114)
+++ sys/netpfil/ipfw/ip_fw_log.c(working copy)
@@ -292,12 +292,10 @@ ipfw_log(struct ip_fw *f, u_int hlen, struct ip_fw
altq->qid);
cmd += F_LEN(cmd);
}
-   if (cmd->opcode == O_PROB)
+   if (cmd->opcode == O_PROB || cmd->opcode == O_TAG ||
+   cmd->opcode == O_SETDSCP)
cmd += F_LEN(cmd);
 
-   if (cmd->opcode == O_TAG)
-   cmd += F_LEN(cmd);
-
action = action2;
switch (cmd->opcode) {
case O_DENY:
Index: sys/netpfil/ipfw/ip_fw_sockopt.c

Re: IPv6 addresses in tables not always working

2012-12-25 Thread Alexander V. Chernikov

On 25.12.2012 18:58, Fabian Wenk wrote:

Hello

To test tables with IPv6 for use with fail2ban (see thread IPv6
Support [1]), I tried it out on a FreeBSD 9.1-RELEASE (r244668) system.
Not all possible rules with tables which include IPv6 addresses seem to
work.

   [1] http://sourceforge.net/mailarchive/message.php?msg_id=29387087

For fail2ban it will both be possible, using mixed tables with IPv4 and
IPv6 addresses and separate tables with only IPv4 or IPv6 addresses. So
I tried a few variants.

First I created 3 different tables (IPv4 only, IPv6 only, IPv4 and IPv6
mixed), this worked so far:

...


Then I deleted the IPv4 and IPv6 only rules to only test with the mixed
IPv4 and IPv6 table(46):

root@freebsd9:~ # ipfw delete 1 2
root@freebsd9:~ # ipfw show | head -1
3  0   0 unreach port tcp from table(46) to me dst-port 22 in
root@freebsd9:~ #

And again testing from the remote system, the timeouts are still with
the same difference for IPv4 and IPv6, but the message for IPv6 is now
different:


unreach and unreach6 do different things:
the former implies the O_REJECT token (which is IPv4-only) while the latter 
uses O_UNREACH6 (which is IPv6-only).


I'm not sure why we're utilizing O_UNREACH6 instead of re-using O_REJECT..


root@freebsd9:~ # ipfw show | head -1
3 12 872 unreach port tcp from table(46) to me dst-port 22 in
root@freebsd9:~ #


I also tried some other rules, which would be use cases for my setup
with fail2ban, but not all of them work:

freebsd9:~ # ipfw add 4 deny ip6 from table\(6\) to me6 22 in
ipfw: bad address table(6)
root@freebsd9:~ #
Yep, this is a known problem (and some similar ones still remain). Fixed in 
r240892 (r241883 for stable/9).


...


To help collect the information regarding IPv6 support in ipfw tables,
what other rules should I test? Or is this already enough information
for any FreeBSD IPFW developer to be able to locate and probably fix
this issues?

I guess it is probably better to first collect some more information
regarding IPv6 and tables here on the list and then create a
corresponding PR later on for it.


bye
Fabian





--
WBR, Alexander



Re: [RFC] IPv6 ifaddr hash

2012-12-08 Thread Alexander V. Chernikov

On 07.12.2012 16:27, Andrey V. Elsukov wrote:

Hi All,

We have discovered that ipfw(4) shows very low performance results with
our rules. One of the biggest problems is rules with the O_IP6_XXX_ME
opcode. They check whether a packet's addresses match the locally
configured IPv6 addresses.

For IPv4 we have an in_ifaddr hash for quick address lookup, but
not for IPv6. So, I have implemented the first patch based on the
code for IPv4, but there are several questions I want to discuss.

The patch is here:
http://people.freebsd.org/~ae/in6_ifaddrhash.diff

1. The hash size. I made it the same what IPv4 has. But I think 512
buckets is too many.
While the same IPv6 configuration can have up to twice as many addresses
as an IPv4 one (because of link-local addresses), 512 is really too much.
Maybe 64 or 128 would be better for the common use case?


2. What hash function is better to use?

We've got at least 3 (known to me) hashes in our kernel:
ng_netflow one, flowtable and in ipfw.

Can you provide some benchmarks and hashing effectiveness for some 
real-world data for those?


3. Using the whole 128 bit of address to hash seems like overkill.


There are people using IPv6 address space just as plain IPv4, e.g:

XX:YY:ZZ::1, XX:YY:ZZ::2, ... ::n, or even XX:YY:ZZ::A.B.C.D, so hashing 
upper 64 bits can lead to collisions.


Hashing lower 64 is more promising, but there can be other use cases, too.

IMHO we can just test the performance of the hashing functions and see how 
big the difference is and whether it is worth talking about.



There is another problem: link-local addresses. They are all the same, 
(or there are some small number of different groups) so one (or more) 
bucket will always be filled by them.


This can result in
* some searches for global addresses being much slower
* IPv6 code accepting packet to link-local address of the other 
interface ( RFC 4291 sec 2.5.6 )


We can work around the first problem by adding global unicast addresses to
the list head and link-local ones to the list tail, but this leaves us with
the second one.


One possible solution is to add the interface index as another parameter 
to the hash function, and use it IFF the address is site-local.
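
A rough sketch of such a hash, purely as an illustration: it hashes the lower 64 bits of the address with FNV-1a (chosen here only for brevity, it is not one of the kernel hashes mentioned above) and mixes in the interface index for scoped addresses. The scope test below checks for link-local (fe80::/10), which is an assumption made for this example; `in6_ifaddr_hash`, `in6_ifaddr_bucket` and `IN6_HASH_BUCKETS` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <netinet/in.h>

#define IN6_HASH_BUCKETS 128	/* hypothetical; must be a power of two */

/* One FNV-1a pass over n bytes, continuing from state h. */
static uint32_t
fnv1a(uint32_t h, const uint8_t *p, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		h ^= p[i];
		h *= 16777619u;
	}
	return (h);
}

/* Hash the interface identifier; add ifindex for scoped addresses. */
static uint32_t
in6_ifaddr_hash(const struct in6_addr *a, uint32_t ifindex)
{
	uint32_t h = 2166136261u;

	assert(a != NULL);
	h = fnv1a(h, &a->s6_addr[8], 8);
	if (a->s6_addr[0] == 0xfe && (a->s6_addr[1] & 0xc0) == 0x80) {
		uint8_t idx[4] = {
			(uint8_t)(ifindex >> 24), (uint8_t)(ifindex >> 16),
			(uint8_t)(ifindex >> 8), (uint8_t)ifindex
		};
		h = fnv1a(h, idx, sizeof(idx));
	}
	return (h);
}

static uint32_t
in6_ifaddr_bucket(const struct in6_addr *a, uint32_t ifindex)
{
	return (in6_ifaddr_hash(a, ifindex) & (IN6_HASH_BUCKETS - 1));
}
```

With this, fe80::1 on two different interfaces produces two different hash values, while a global address hashes the same regardless of the interface it is looked up on.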








Re: [CFT] Virtual BPF interfaces

2012-12-03 Thread Alexander V. Chernikov

On 03.12.2012 12:11, Gleb Smirnoff wrote:

On Sun, Dec 02, 2012 at 04:48:18AM +0400, Alexander V. Chernikov wrote:
A> On 10.06.2012 18:20, Alexander V. Chernikov wrote:
A>  On 27.04.2012 03:44, Hiroki Sato wrote:
A>  Alexander V. Chernikov melif...@freebsd.org wrote
A>  in 4f96e71b.9020...@freebsd.org:
A>
A>  me On 24.04.2012 21:05, Hiroki Sato wrote:
A>
A>  Proof-of-concept patch attached.
A>
A> Hopefully, libpcap code is easily extendable.
A> New version attached:
A> * BPF code is now able to use 'virtual' interfaces without real ifnet
A> * New bpfattach3() / bpfdetach3() routines were added to attach virtual
A> ifaces
A> * New BIOCGIFLIST ioctl is added to permit userland to retrieve
A> available virtual interfaces
A> * freebsd-specific 'platform_finddevs' version is added to libpcap code
A> (new file)
A>
A> There are some rough edges (conditional code in pcap-bpf.c, lack of
A> documentation, maybe some style issues), but generally it seems to work
A> and does not interfere with contrib/ code much (from my point of view).
A>
A> ipfw log device was converted to use new bpf(4) api, see attached patch.

Nice proof of concept, Alexander!

What prevents us from unifying all bpf providers to be virtual in
current terms? I think if we finish the divorce between ifnet and bpf, the code
would get simpler and you can proceed further with reducing locking
overhead.


We have to jump from ifnet to the list of per-ifnet BPF consumers 
somehow, so I'm not sure if we can do much more here. BPF itself doesn't 
require much from parent ifnet.


What I really want to do next is the following:

1) Make BPF_PEERS_PRESENT(ifp) to be (ifp->if_bpf != NULL). This saves 
some processing time and permits 'bpf_if' to be totally opaque 
without any hacks.
2) Set the if_bpf pointer IFF there are some consumers (and set it back to 
NULL when all consumers are detached). This should work well for the 'main' 
BPF DLT, but a single (currently, 802.11) interface can hold more than one 
DLT. Probably we can save the dst pointer passed to bpfattach2() in the given 
bpf_if structure, and set this value instead of ->if_bpf.
This, however, can lead to hard-to-find problems, since bpfattach[2] is 
usually not called by driver directly.
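
To make the two steps concrete, here is a userspace mock-up. Everything in it is hypothetical (`bpf_if_sketch`, `ifnet_sketch`, `PEERS_PRESENT` and the three helpers are stand-ins, not the bpf(4) API): the bpf_if remembers where its owner keeps the pointer and publishes itself there only while it has consumers, so the "peers present" test becomes a NULL check.

```c
#include <assert.h>
#include <stddef.h>

struct bpf_if_sketch {
	struct bpf_if_sketch **dst;  /* where the owner keeps its pointer */
	unsigned consumers;          /* number of attached listeners */
};

struct ifnet_sketch {
	struct bpf_if_sketch *if_bpf;  /* NULL while nobody listens */
};

#define PEERS_PRESENT(ifp) ((ifp)->if_bpf != NULL)

/* Remember the owner's slot (like the dst argument of bpfattach2()). */
static void
bpf_register(struct bpf_if_sketch *bp, struct bpf_if_sketch **dst)
{
	bp->dst = dst;
	bp->consumers = 0;
	*dst = NULL;
}

/* First consumer publishes the pointer. */
static void
bpf_consumer_attach(struct bpf_if_sketch *bp)
{
	if (bp->consumers++ == 0)
		*bp->dst = bp;
}

/* Last consumer hides it again. */
static void
bpf_consumer_detach(struct bpf_if_sketch *bp)
{
	assert(bp->consumers > 0);
	if (--bp->consumers == 0)
		*bp->dst = NULL;
}
```

A real implementation would also need locking around the pointer updates, which this sketch leaves out.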










--
WBR, Alexander




Re: [CFT] ipfw SMP-ready dynamic states

2012-11-27 Thread Alexander V. Chernikov

On 27.11.2012 09:54, Gleb Smirnoff wrote:

On Tue, Nov 27, 2012 at 02:30:51AM +0400, Alexander V. Chernikov wrote:
A> On 14.11.2012 19:47, Gleb Smirnoff wrote:
A>  On Tue, Nov 13, 2012 at 11:28:23PM +0400, Alexander V. Chernikov wrote:
A>  A>  So, we can do the following:
A>  A>  1) lock increments/decrements via some separate mutex
A>  A>  2) do nothing
A>  A>  3) take some combined approach:
A>
A>  4) Take it via uma_zone_getcur(ipfw_dyn_rule_zone);
A> It acquired zone lock to collect per-cpu item data, but
A> uma_zone_set_max() did the trick.
A>
A>
A> Patch updated:
A> * UMA zone is now allocated per-VNET instance

Why? This only leads to more waste in allocator.

To be able to enforce state limit per-instance as it currently works.





--
WBR, Alexander




Re: [CFT] ipfw SMP-ready dynamic states

2012-11-26 Thread Alexander V. Chernikov

On 14.11.2012 19:47, Gleb Smirnoff wrote:
> On Tue, Nov 13, 2012 at 11:28:23PM +0400, Alexander V. Chernikov wrote:
> A> So, we can do the following:
> A> 1) lock increments/decrements via some separate mutex
> A> 2) do nothing
> A> 3) take some combined approach:
>
> 4) Take it via uma_zone_getcur(ipfw_dyn_rule_zone);

It acquired zone lock to collect per-cpu item data, but
uma_zone_set_max() did the trick.

Patch updated:
* UMA zone is now allocated per-VNET instance
* The dyn_max limit is now enforced by UMA code
* some (serious) bugs removed from limiting code

If there are no objections, I plan to commit this patch to base on Thursday.
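
To illustrate why a per-instance zone lets the allocator enforce the limit by itself, here is a rough userland sketch. It is not the UMA API: struct sketch_zone and its functions are hypothetical, and it ignores locking and per-CPU caches. Each instance owns a zone with its own maximum, so hitting the limit in one instance does not affect another:

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userland stand-in for a per-instance zone. */
struct sketch_zone {
	size_t		item_size;	/* size of one state entry */
	unsigned int	max_items;	/* limit, as uma_zone_set_max() would set */
	unsigned int	cur_items;	/* items currently allocated */
};

/* Returns NULL once the zone is full, so the caller creates no new state. */
static void *
sketch_zone_alloc(struct sketch_zone *z)
{
	void *item;

	if (z->cur_items >= z->max_items)
		return (NULL);
	item = calloc(1, z->item_size);
	if (item != NULL)
		z->cur_items++;
	return (item);
}

/* Releases one item and makes room for a new allocation. */
static void
sketch_zone_free(struct sketch_zone *z, void *item)
{
	if (item == NULL)
		return;
	free(item);
	z->cur_items--;
}
```

The current item count then also serves as the dynamic rule count, which is what the updated comment in the patch below says.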

Index: sys/netpfil/ipfw/ip_fw_sockopt.c
===
--- sys/netpfil/ipfw/ip_fw_sockopt.c(revision 243526)
+++ sys/netpfil/ipfw/ip_fw_sockopt.c(working copy)
@@ -382,7 +382,7 @@ del_entry(struct ip_fw_chain *chain, uint32_t arg)
continue;
l = RULESIZE(rule);
chain->static_len -= l;
-   ipfw_remove_dyn_children(rule);
+   ipfw_expire_dyn_rules(chain, rule, RESVD_SET);
rule->x_next = chain->reap;
chain->reap = rule;
}
@@ -925,7 +925,7 @@ ipfw_getrules(struct ip_fw_chain *chain, void *buf
dst->timestamp += boot_seconds;
bp += l;
}
-   ipfw_get_dynamic(&bp, ep); /* protected by the dynamic lock */
+   ipfw_get_dynamic(chain, &bp, ep); /* protected by the dynamic lock */
return (bp - (char *)buf);
 }
 
Index: sys/netpfil/ipfw/ip_fw_private.h
===
--- sys/netpfil/ipfw/ip_fw_private.h(revision 243526)
+++ sys/netpfil/ipfw/ip_fw_private.h(working copy)
@@ -175,7 +175,9 @@ enum { /* result for matching dynamic rules */
  * and only to release the result of lookup_dyn_rule().
  * Eventually we may implement it with a callback on the function.
  */
-void ipfw_dyn_unlock(void);
+struct ip_fw_chain;
+void ipfw_expire_dyn_rules(struct ip_fw_chain *, struct ip_fw *, int);
+void ipfw_dyn_unlock(ipfw_dyn_rule *q);
 
 struct tcphdr;
 struct mbuf *ipfw_send_pkt(struct mbuf *, struct ipfw_flow_id *,
@@ -185,11 +187,9 @@ int ipfw_install_state(struct ip_fw *rule, ipfw_in
 ipfw_dyn_rule *ipfw_lookup_dyn_rule(struct ipfw_flow_id *pkt,
int *match_direction, struct tcphdr *tcp);
 void ipfw_remove_dyn_children(struct ip_fw *rule);
-void ipfw_get_dynamic(char **bp, const char *ep);
+void ipfw_get_dynamic(struct ip_fw_chain *chain, char **bp, const char *ep);
 
-void ipfw_dyn_attach(void);/* uma_zcreate  */
-void ipfw_dyn_detach(void);/* uma_zdestroy ... */
-void ipfw_dyn_init(void);  /* per-vnet initialization */
+void ipfw_dyn_init(struct ip_fw_chain *);  /* per-vnet initialization */
 void ipfw_dyn_uninit(int); /* per-vnet deinitialization */
 int ipfw_dyn_len(void);
 
@@ -259,6 +259,9 @@ struct sockopt; /* used by tcp_var.h */
 #define IPFW_WLOCK(p) rw_wlock(&(p)->rwmtx)
 #define IPFW_WUNLOCK(p) rw_wunlock(&(p)->rwmtx)
 
+#define	IPFW_UH_RLOCK_ASSERT(_chain)	rw_assert(&(_chain)->uh_lock, RA_RLOCKED)
+#define	IPFW_UH_WLOCK_ASSERT(_chain)	rw_assert(&(_chain)->uh_lock, RA_WLOCKED)
+
 #define IPFW_UH_RLOCK(p) rw_rlock(&(p)->uh_lock)
 #define IPFW_UH_RUNLOCK(p) rw_runlock(&(p)->uh_lock)
 #define IPFW_UH_WLOCK(p) rw_wlock(&(p)->uh_lock)
Index: sys/netpfil/ipfw/ip_fw_dynamic.c
===
--- sys/netpfil/ipfw/ip_fw_dynamic.c(revision 243526)
+++ sys/netpfil/ipfw/ip_fw_dynamic.c(working copy)
@@ -95,7 +95,7 @@ __FBSDID("$FreeBSD$");
  * The lifetime of dynamic rules is regulated by dyn_*_lifetime,
  * measured in seconds and depending on the flags.
  *
- * The total number of dynamic rules is stored in dyn_count.
+ * The total number of dynamic rules is equal to UMA zone items count.
  * The max number of dynamic rules is dyn_max. When we reach
  * the maximum number of rules we do not create anymore. This is
  * done to avoid consuming too much memory, but also too much
@@ -111,38 +111,34 @@ __FBSDID("$FreeBSD$");
  * passes through the firewall. XXX check the latter!!!
  */
 
+struct ipfw_dyn_bucket {
+   struct mtx  mtx;/* Bucket protecting lock */
+   ipfw_dyn_rule   *head;  /* Pointer to first rule */
+};
+
 /*
  * Static variables followed by global ones
  */
-static VNET_DEFINE(ipfw_dyn_rule **, ipfw_dyn_v);
-static VNET_DEFINE(u_int32_t, dyn_buckets);
+static VNET_DEFINE(struct ipfw_dyn_bucket *, ipfw_dyn_v);
+static VNET_DEFINE(u_int32_t, dyn_buckets_max);
 static VNET_DEFINE(u_int32_t, curr_dyn_buckets);
 static VNET_DEFINE(struct callout, ipfw_timeout);
 #define	V_ipfw_dyn_v	VNET(ipfw_dyn_v)
-#define	V_dyn_buckets	VNET(dyn_buckets)
+#define

Re: [RFC] Enabling IPFIREWALL_FORWARD in run-time

2012-10-19 Thread Alexander V. Chernikov

On 19.10.2012 18:05, Andre Oppermann wrote:
> On 19.10.2012 14:18, Andrey V. Elsukov wrote:
>> On 19.10.2012 16:02, Andre Oppermann wrote:
>>>> http://people.freebsd.org/~ae/pfil_forward.diff
>>>>
>>>> Also we have done some tests with the ixia traffic generator connected
>>>> via 10G network adapter. Tests have shown that there is no visible
>>>> difference, and there is no visible performance degradation.
>>>>
>>>> Any objections?
>>>
>>> No objection as such.  However I don't entirely agree with the
>>> naming of pfil_forward.  The functionality is specific to IPFW
>>> and TCP, it's doing transparent interjected termination of tcp
>>> connections on the local host while keeping the original IP
>>> addresses and port numbers visible in netstat output.
>>>
>>> So it's a feature of IPFW/IP and should be fitted in there for
>>> sysctl name and .h files instead of pfil.
>>
>> Actually it can be used not only by ipfw. We already have
>> net.inet.ip.forwarding and net.inet6.ip6.forwarding variables, and
>> placing it into net.inet.ip.fw is undesirable, because we can have
>> kernel without ipfw. So, i decided to choose pfil, because it could not
>> work without pfil.
>
> Again, it's not a property of pfil.  It's a property of IP and it

Not exactly. It is currently supported in both IPv4 and IPv6.

> should live there from a configuration point of view. Other firewalls
> than ipfw don't make use of it.
>
> You could rename it to transparent connection proxy or some such.

fwd is widely used for policy-based routing, so it is not just an 
upper-layer TCP feature.






--
WBR, Alexander



Re: kern/156770: [ipfw] [dummynet] [patch]: performance improvement and several extensions

2012-07-01 Thread Alexander V. Chernikov

On 01.07.2012 23:09, Luigi Rizzo wrote:
> On Sun, Jul 01, 2012 at 03:54:35PM +, melif...@freebsd.org wrote:
>> Synopsis: [ipfw] [dummynet] [patch]: performance improvement and several
>> extensions
>>
>> Responsible-Changed-From-To: freebsd-ipfw->melifaro
>> Responsible-Changed-By: melifaro
>> Responsible-Changed-When: Sun Jul 1 15:54:17 UTC 2012
>> Responsible-Changed-Why:
>> Take
>>
>> http://www.freebsd.org/cgi/query-pr.cgi?pr=156770
>
> Alex,

Not sure if you're speaking to me, since both the submitter and I are 
Alexanders :) However, I'll try to answer some of the questions.

> please pass any ipfw-related patch through me before committing.
>
> On this specific PR i have some comments and several concerns.
>
> First, as mentioned in the thread, some specific features (e.g. ftags)
> might be of interest, but the fact that this is a single monolithic patch
> makes it hard to apply and review. Especially, at least judging from the
> description, i believe some of the changes replicate features that
> were already inserted around 2009 and later (in then-head).

We already had a private discussion, which resulted in preparing the most 
interesting (at least to me) parts of the code to be split into separate 
patches and reworked for -current.

I'm mostly interested in the rule indexes.

> On the negative side:
> - documentation on new features is completely absent. Just a brief mention
>    in the manpage of ftag/funtag, a short comment in a C source code.
>
> - the way some features are implemented is through adding new IOCTLs,
>    which is the wrong way of doing things. In the 2009 rewrite (ipfw3)
>    i tried to use a single ioctl which carries tagged messages
>    for the various requests (similar to the microinstructions which make
>    up a rule) so the code is easier to extend without breaking ABIs.
>    Please follow the new style if you need to add commands.

IP_FW3 is already used in the IPv6 tables code, so there is some ipfw(8) 
and kernel code to reuse.

> - can you please split the patch in individual components, and
>    make sure that they not replicate functions already existent
>    (or if they do, are they an improvement) ? I am especially
>    referring to indexed skipto
>
> - a large number of changes to the userspace code replaces errx()
>    with return my_err(...) . I might agree on the principle, but
>    I'd like to see a few notes on why this change is required,
>    and whether it can be applied independently of the others.
>
> cheers
> luigi





Re: kern/169206: [ipfw] ipfw does not flush entries in table

2012-06-20 Thread Alexander V. Chernikov
The following reply was made to PR kern/169206; it has been noted by GNATS.

From: Alexander V. Chernikov melif...@freebsd.org
To: bug-follo...@freebsd.org, pi...@pixel.org.pl
Cc:  
Subject: Re: kern/169206: [ipfw] ipfw does not flush entries in table
Date: Wed, 20 Jun 2012 18:29:18 +0400

 Is it possible for you to upgrade this box to latest 8-STABLE (at least 
 r237309) and check if this helps?


Re: VNET

2012-06-20 Thread Alexander V. Chernikov

On 19.06.2012 12:56, Sami Halabi wrote:
> Hi,
>
> I want to ask about VNET jails. I read somewhere that I'm able to run IPFW,
> but not the PF firewall, in a vnet jail.
> Is that correct?
>
> I want a vnet jail basically for nat, so natd with ipfw + ipdivert is my

1) You can do nat without vnet.
2) ipfw nat is currently the easiest way to do nat.

> choice? Or can I use pf somehow? I never used pf before,
> so I would like some advice here...
>
> Thanks in advance,




--
WBR, Alexander


Re: kern/156770: [ipfw] [dummynet] [patch]: performance improvement and several extensions

2012-06-15 Thread Alexander V. Chernikov
The following reply was made to PR kern/156770; it has been noted by GNATS.

From: Alexander V. Chernikov melif...@freebsd.org
To: bug-follo...@freebsd.org, al...@alter.org.ua
Cc:  
Subject: Re: kern/156770: [ipfw] [dummynet] [patch]: performance improvement
 and several extensions
Date: Fri, 15 Jun 2012 16:35:56 +0400

 Hello Alexandr!
 
 I'm afraid a single huge patch for a legacy release is not a promising start.
 Since the development model assumes new code is committed to -current 
 first, you should probably port these features to -current (it does not 
 differ from 8-STABLE much).
 It is also much easier to discuss/import features in small chunks 
 instead of a single huge change, so splitting every feature into a separate 
 diff is probably a good thing to do.
 
 Please note that some of the functionality (skipto tablearg, interface 
 tables) is already implemented in a different way.
 
 Personally, the index table for fast skipto/pipes, mapped tables and the 
 io_fast patch look very promising to me, so we can discuss directly if you're 
 interested.
 
 


Re: ipfw rules consuming CPU

2012-06-09 Thread Alexander V. Chernikov

On 09.06.2012 15:19, Sami Halabi wrote:
> Hi,
> all rules together are fewer than 80 rules

However, that is too many.
You should reduce this to 10 rules or fewer (at least for the main traffic flow).

(Btw, there is a related http://wiki.freebsd.org/NetworkPerformanceTuning 
wiki page)

> how does tablearg help here? each ip & pipe (up & down) is unique...

ipfw table 1 add 182.46.92.0/24 1000
ipfw table 1 add XXX.XXX.XX.0/24 1001
..
ipfw table 2 add 182.46.92.0/24 1002
ipfw table 2 add XXX.XXX.XX.0/24 1003

ipfw add 4000 pipe tablearg from table(1) to any out xmit bce1
ipfw add 4100 pipe tablearg from any to table(1) in recv bce1


It is often a good idea to split in/out rules at the start (e.g. skipto 
1 ip from any to any out)


You can send me your ipfw config and we can discuss it in more detail.



any other advices?

Sami

On Sat, Jun 9, 2012 at 1:15 PM, Alexander V. Chernikov
<melif...@freebsd.org> wrote:

On 09.06.2012 01:56, Sami Halabi wrote:

Hi,

I manage a FreeBSD server as an edge router & firewall.

the setup has 10G interfaces (ixgbe-82599EB) and 1G
interfaces(em-82571EB
bce-BCM5709) connected to 10G/1G switches.

With the following setup i get higher cpu usage:
bce1-upstream provider with little bandwidth, so i use pipes to
limit
users, and subnets
ix0 - Internet Exchange

some rules.
.
.
.from 4000 starts pipes for specefic ips bandwidth allocations
0400062100530015845967300616 pipe 1003 ip from
182.46.92.13 to any
out xmit bce1
04100   412898975373064110648124 pipe 1004 ip from any to
182.46.92.13
in recv bce1

You should use pipe tablearg for that. Traversing 4k rules
effectively kills all performance.


.
.
.
.7000 is the wider pipeline for the whole block
0700091271547244651308720315 pipe 1000 ip from
182.46.92.0/24 to
any out xmit bce1
071004837016828 458027989917 pipe 1002 ip from any to
182.46.92.0/24 in recv bce1
last rule default to accept...

specefic pipes (1003-...) have limits say between 1-10Mbps, and
the wider
pipe (1000 and 1002) has a global limit of 40MBps that should be
reached by
all other non-specefic ips, config like this:
#Wide
ipfw pipe 1000 config bw 40Mbit/s queue 200Kbytes
ipfw pipe 1002 config bw 40Mbit/s queue 200Kbytes
#specefic
ipfw pipe 1003 config bw 9Mbit/s queue 200Kbytes
ipfw pipe 1004 config bw 9Mbit/s queue 200Kbytes
ipfw pipe 1005 config bw 3Mbit/s queue 200Kbytes
ipfw pipe 1006 config bw 3Mbit/s queue 200Kbytes
ipfw pipe 1007 config bw 5Mbit/s queue 200Kbytes
ipfw pipe 1008 config bw 5Mbit/s queue 200Kbytes
ipfw pipe 1009 config bw 10Mbit/s queue 200Kbytes
ipfw pipe 1010 config bw 10Mbit/s queue 200Kbytes


with this configuration when i have lots of traffic (3-6GB)
going via ix0
(not necessarly the ips described above, lets say to a server in
my net ip
1832.46.93.4 and users behind the Internet Exchange) i see high
cpu usage
(70-90%).

my first test was to: ipfw add 1 allow all from any to any, and
cpu usage
drops immediatly to 10-15%.
but that not why i want (i wantto keep thelimits) so I add rule
right
before 4000 and the cpu usage drops down to 10-20%:
03020 1669463072808 1493341413029803 allow ip from any to any
via ix0


Any advice why this happens? or should it be there in the first
place?
I use FreeBSD 8.1-R-p10-amd64.

Thanks in advance,



--
WBR, Alexander




--
Sami Halabi
Information Systems Engineer
NMS Projects Expert
FreeBSD SysAdmin Expert




--
WBR, Alexander


Re: IPFW tables trouble

2012-05-17 Thread Alexander V. Chernikov

On 16.05.2012 16:07, Daniel Kalchev wrote:
> Hello,
>
> I am having a persistent problem when using tables with ipfw. On a
> number of routers, built with various FreeBSD versions, with ipfw as
> loadable module or statically compiled, the problem remains the same.
>
> From time to time, ipfw spews errors like this:
>
> Non-unique normal route, mask not entered
> Non-unique normal route, mask not entered
>
> or
>
> rn_delete: couldn't find our annotation
> rn_delete: couldn't find our annotation
> rn_delete: couldn't find our annotation

It seems that under some conditions the mask is passed incorrectly to the radix 
code. A wrong mask can be generated by the ipfw module if userland passes a 
value larger than 32. What is funny is that the kernel still doesn't check the mask 
value in the IPv4 case.


Can you update your 9-stable, add something like the following:

Index: sys/netinet/ipfw/ip_fw_table.c
===
--- sys/netinet/ipfw/ip_fw_table.c  (revision 235530)
+++ sys/netinet/ipfw/ip_fw_table.c  (working copy)
@@ -153,6 +153,8 @@ ipfw_add_table_entry(struct ip_fw_chain *ch, uint1
case IPFW_TABLE_CIDR:
if (plen == sizeof(in_addr_t)) {
 #ifdef INET
+   if (mlen > 32)
+   return (EINVAL);
ent = malloc(sizeof(*ent), M_IPFW_TBL, M_WAITOK | M_ZERO);

ent->value = value;
/* Set 'total' structure length */

and see if this helps?

The same idea applies to 7/8, though the code there is different.
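
Why the check matters can be seen in a small standalone sketch (the helper name sketch_plen_to_mask is hypothetical, not part of the ipfw code). It turns an IPv4 prefix length into a host-order netmask and rejects anything above 32 instead of producing a bogus mask:

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical helper: convert an IPv4 prefix length to a host-order
 * netmask, rejecting lengths above 32 the way the proposed check does.
 */
static int
sketch_plen_to_mask(unsigned int mlen, uint32_t *mask)
{
	if (mlen > 32)
		return (EINVAL);
	/* A shift by 32 is undefined for a 32-bit type, so handle 0 apart. */
	*mask = (mlen == 0) ? 0 : (uint32_t)(0xffffffffu << (32 - mlen));
	return (0);
}
```

Without the range check, a length such as 33 would lead to a shift by a negative amount, which is undefined behavior in C, so the resulting mask handed to the radix code would be unpredictable.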




> Sometimes, after such output, if one does:
>
> ipfw table 1 flush
> ipfw table 1 list
>
> the output is non-empty. It should be empty, right?

Can you show an example of such output?

How often does this happen?

> This problem has troubled me for a number of years already.
>
> Thanks in advance,
> Daniel
Daniel


Re: CFR: ipfw0 pseudo-interface clonable

2012-04-24 Thread Alexander V. Chernikov

On 24.04.2012 19:26, Hiroki Sato wrote:
> Hi,
>
>  I created the attached patch to make the current ipfw0
>  pseudo-interface clonable.  The functionality of ipfw0 logging
>  interface is not changed by this patch, but the ipfw0
>  pseudo-interface is not created by default and can be created with
>  the following command:
>
>   # ifconfig ipfw0 create
>
>  Any objection to commit this patch?  The primary motivation for this
>  change is that presence of the interface by default increases size of
>  the interface list, which is returned by NET_RT_IFLIST sysctl even
>  when the sysadmin does not need it.  Also this pseudo-interface can
>  confuse the sysadmin and/or network-related userland utilities like
>  SNMP agent.  With this patch, one can use ifconfig(8) to
>  create/destroy the pseudo-interface as necessary.

The log_if usage in ipfw_log() is not protected, so it is possible to trigger 
a use-after-free.

Maybe it is better to have some interface flag which makes NET_RT_IFLIST 
skip the given interface?

> -- Hiroki



--
WBR, Alexander


Re: CFR: ipfw0 pseudo-interface clonable

2012-04-24 Thread Alexander V. Chernikov

On 24.04.2012 21:05, Hiroki Sato wrote:
> Alexander V. Chernikov <melif...@freebsd.org> wrote
>   in <4f96d11b.2060...@freebsd.org>:
>
> me> On 24.04.2012 19:26, Hiroki Sato wrote:
> me> > Hi,
> me> >
> me> > I created the attached patch to make the current ipfw0
> me> > pseudo-interface clonable.  The functionality of ipfw0 logging
> me> > interface is not changed by this patch, but the ipfw0
> me> > pseudo-interface is not created by default and can be created with
> me> > the following command:
> me> >
> me> >  # ifconfig ipfw0 create
> me> >
> me> > Any objection to commit this patch?  The primary motivation for this
> me> > change is that presence of the interface by default increases size of
> me> > the interface list, which is returned by NET_RT_IFLIST sysctl even
> me> > when the sysadmin does not need it.  Also this pseudo-interface can
> me> > confuse the sysadmin and/or network-related userland utilities like
> me> > SNMP agent.  With this patch, one can use ifconfig(8) to
> me> > create/destroy the pseudo-interface as necessary.
> me>
> me> ipfw_log() log_if usage is not protected, so it is possible to trigger
> me> use-after-free.
>
>  Ah, right.  I will revise lock handling and resubmit the patch.
>
> me> Maybe it is better to have some interface flag which makes
> me> NET_RT_IFLIST skip given interface ?
>
>  I do not think so.  NET_RT_IFLIST should be able to list all of the
>  interfaces because it is the purpose.

Okay, another try (afair already discussed somewhere):
Do we really need all BPF providers to have ifnets?
It seems that removing all bp_bif dependencies from the BPF code is not such a hard task.

> -- Hiroki



--
WBR, Alexander


Re: Firewall Profiling.

2011-12-27 Thread Alexander V. Chernikov

On 27.12.2011 04:54, Pawel Tyll wrote:
> Hi lists,
>
> Are there any profiling tools in the system or ports that would allow
> me to determine how much processing is being done per packet and how
> long does it take? I would like to predict possible PPS load for my
> system and perhaps locate and remove some bottlenecks.
>
> Is IPFW efficient enough to firewall 2x10GE (in+out) interfaces
> without much latency increase, when running on modern hardware
> with Intel NICs? Majority of processing tasks would probably be setfib
> according to matches in tables.

IPFW seems to add a more or less constant overhead per rule. In our setup, 
~20 rules increase load by 100% (one core).  We are able to reach 10GE 
(1.1 Mpps) on some routers with most packets traversing 8-10 ipfw rules. 
However, even with 'ipfw add 1 allow ip from any to any',
routing 1.1 Mpps utilizes an E5645 by more than 80% (with IGP routes in the 
rtable only). YMMV, but 2x10G is too much at the moment even without ipfw.
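
As a rough back-of-the-envelope aid for the figures above, the two helpers below (hypothetical names) compute the average packet size implied by a bit rate and packet rate, and the CPU cycles one core has per packet. The 2.4 GHz clock used in the note after the code is an assumption about the E5645's nominal frequency, not something stated in this thread:

```c
/* Average bytes per packet for a given bit rate and packet rate. */
static double
sketch_avg_packet_bytes(double bits_per_sec, double packets_per_sec)
{
	return (bits_per_sec / 8.0 / packets_per_sec);
}

/* CPU cycles available per packet on one core at the given clock. */
static double
sketch_cycles_per_packet(double core_hz, double packets_per_sec)
{
	return (core_hz / packets_per_sec);
}
```

For 10 Gbit/s at 1.1 Mpps the first helper gives roughly 1136 bytes per packet, i.e. mostly large packets. Assuming a 2.4 GHz core handling all 1.1 Mpps, the second gives on the order of 2200 cycles per packet, which every additional rule has to fit into.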




Pawel.


___
freebsd-...@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org




--
WBR, Alexander


Re: Firewall Profiling.

2011-12-27 Thread Alexander V. Chernikov
Mike Tancsa wrote:
> On 12/27/2011 6:36 AM, Alexander V. Chernikov wrote:
>>> Is IPFW efficient enough to firewall 2x10GE (in+out) interfaces
>>> without much latency increase, when running on modern hardware
>>> with Intel NICs? Majority of processing tasks would probably be setfib
>>> according to matches in tables.
>> IPFW seems to add more or less constant overhead per rule. In our setup,
>> ~20 rules increase load by 100% (one core).  We are able to reach 10GE
>> (1.1mpps) on some routers with most packets travelling 8-10 ipfw rules.
>> However, even with ipfw add 1 allow ip from any to any
>> 1.1 mpps routing utilizes E5645 by more that 80%. (with IGP routes in
>> rtable only). YMMV, but 2x10G is too much at the moment even without ipfw.
>
> Don't some of the modern 10G adapters support filtering in the card
> itself?  e.g. cxgbe.
We're using Intel 8259X; it supports hardware filtering (flow director
and some other specific things like DCB), but:
1) Flow director is currently not supported (on FreeBSD)
2) There is no ipfw opcode compiler (however, it seems that it's not too
hard to write one).
3) If the ruleset is more or less optimized, the firewall is not the main CPU
consumer.

>    ---Mike
 






IPFW eXtended tables [Was: Re: IPFW tables, dummynet and IPv6]

2011-12-25 Thread Alexander V. Chernikov
Hello everyone.

The final patch version now uses a single IP_FW3 socket option.
Together with the other changes, this makes me think these changes should be
reviewed by a wider audience. If there are no
objections/comments, I plan to commit this on Tuesday.

Changes:
* Tables (actually, radix trees) are now created/freed on demand.
* Tables can be of different types (CIDR and interfaces are supported at
the moment)
* Each table has 2 pointers (basic and eXtended tree) which are
initialized independently, permitting both IPv4 and IPv6 addresses to be
specified in the same table without performance loss
* Every new opcode uses the IP_FW3 socket option

This change does not break the ABI: an old ipfw(8) binary can configure IPv4
addresses on CIDR-type tables and flush every table.
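
The single-option approach boils down to prefixing each request with a small header that carries the real opcode. The sketch below shows only that wrapping step in plain userland C. struct sketch_op3_header is a stand-in whose layout is an assumption (the real ip_fw3_opheader is defined in the ipfw headers), and sketch_wrap_op3 is a hypothetical helper that uses malloc() where the patch uses alloca():

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for ip_fw3_opheader; the field layout here is an assumption. */
struct sketch_op3_header {
	uint16_t	opcode;		/* requested operation */
	uint16_t	reserved[3];	/* zeroed, kept for future use */
};

/*
 * Wrap an option payload behind the header. Returns a malloc()ed buffer
 * (the caller frees it) and stores the total length in *lenp, or returns
 * NULL on allocation failure.
 */
static void *
sketch_wrap_op3(uint16_t opcode, const void *optval, size_t optlen,
    size_t *lenp)
{
	struct sketch_op3_header *op3;
	size_t len = sizeof(*op3) + optlen;

	op3 = malloc(len);
	if (op3 == NULL)
		return (NULL);
	/* Zero the header, including the reserved fields. */
	memset(op3, 0, sizeof(*op3));
	if (optlen > 0)
		memcpy(op3 + 1, optval, optlen);
	op3->opcode = opcode;
	*lenp = len;
	return (op3);
}
```

The resulting buffer is what would be handed to setsockopt() under the one IP_FW3 option name, so adding a new opcode later needs no new socket option.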
Index: sbin/ipfw/ipfw2.c
===
--- sbin/ipfw/ipfw2.c   (revision 228874)
+++ sbin/ipfw/ipfw2.c   (working copy)
@@ -42,6 +42,8 @@
 #include <timeconv.h>  /* _long_to_time */
 #include <unistd.h>
 #include <fcntl.h>
+#include <sys/param.h> /* MIN */
+#include <stddef.h>    /* offsetof */
 
 #include <net/ethernet.h>
 #include <net/if.h>    /* only IFNAMSIZ */
@@ -57,6 +59,12 @@ struct cmdline_opts co;  /* global options */
 
 int resvd_set_number = RESVD_SET;
 
+int ipfw_socket = -1;
+
+#ifndef s6_addr32
+#define s6_addr32 __u6_addr.__u6_addr32
+#endif
+
 #define GET_UINT_ARG(arg, min, max, tok, s_x) do { \
if (!av[0]) \
errx(EX_USAGE, "%s: missing argument", match_value(s_x, tok)); \
@@ -370,33 +378,65 @@ safe_realloc(void *ptr, size_t size)
 int
 do_cmd(int optname, void *optval, uintptr_t optlen)
 {
-   static int s = -1;  /* the socket */
int i;
 
if (co.test_only)
return 0;
 
-   if (s == -1)
-   s = socket(AF_INET, SOCK_RAW, IPPROTO_RAW);
-   if (s < 0)
+   if (ipfw_socket == -1)
+   ipfw_socket = socket(AF_INET, SOCK_RAW, IPPROTO_RAW);
+   if (ipfw_socket < 0)
err(EX_UNAVAILABLE, "socket");
 
if (optname == IP_FW_GET || optname == IP_DUMMYNET_GET ||
-   optname == IP_FW_ADD || optname == IP_FW_TABLE_LIST ||
-   optname == IP_FW_TABLE_GETSIZE ||
+   optname == IP_FW_ADD || optname == IP_FW3 ||
optname == IP_FW_NAT_GET_CONFIG ||
optname < 0 ||
optname == IP_FW_NAT_GET_LOG) {
if (optname < 0)
optname = -optname;
-   i = getsockopt(s, IPPROTO_IP, optname, optval,
+   i = getsockopt(ipfw_socket, IPPROTO_IP, optname, optval,
(socklen_t *)optlen);
} else {
-   i = setsockopt(s, IPPROTO_IP, optname, optval, optlen);
+   i = setsockopt(ipfw_socket, IPPROTO_IP, optname, optval, optlen);
}
return i;
 }
 
+/*
+ * do_setcmd3 - pass ipfw control cmd to kernel
+ * @optname: option name
+ * @optval: pointer to option data
+ * @optlen: option length
+ *
+ * Function encapsulates option value in IP_FW3 socket option
+ * and calls setsockopt().
+ * Function returns 0 on success or -1 otherwise.
+ */
+int
+do_setcmd3(int optname, void *optval, socklen_t optlen)
+{
+   socklen_t len;
+   ip_fw3_opheader *op3;
+
+   if (co.test_only)
+   return (0);
+
+   if (ipfw_socket == -1)
+   ipfw_socket = socket(AF_INET, SOCK_RAW, IPPROTO_RAW);
+   if (ipfw_socket < 0)
+   err(EX_UNAVAILABLE, "socket");
+
+   len = sizeof(ip_fw3_opheader) + optlen;
+   op3 = alloca(len);
+   /* Zero reserved fields */
+   memset(op3, 0, sizeof(ip_fw3_opheader));
+   memcpy(op3 + 1, optval, optlen);
+   op3->opcode = optname;
+
+   return setsockopt(ipfw_socket, IPPROTO_IP, IP_FW3, op3, len);
+}
+
 /**
  * match_token takes a table and a string, returns the value associated
  * with the string (-1 in case of failure).
@@ -3854,7 +3894,7 @@ ipfw_flush(int force)
 }
 
 
-static void table_list(ipfw_table_entry ent, int need_header);
+static void table_list(uint16_t num, int need_header);
 
 /*
  * This one handles all table-related commands
@@ -3866,12 +3906,12 @@ ipfw_flush(int force)
 void
 ipfw_table_handler(int ac, char *av[])
 {
-   ipfw_table_entry ent;
+   ipfw_table_xentry xent;
int do_add;
int is_all;
size_t len;
char *p;
-   uint32_t a;
+   uint32_t a, type, mask, addrlen;
uint32_t tables_max;
 
len = sizeof(tables_max);
@@ -3886,18 +3926,20 @@ ipfw_table_handler(int ac, char *av[])
 #endif
}
 
+   memset(&xent, 0, sizeof(xent));
+
ac--; av++;
if (ac && isdigit(**av)) {
-   ent.tbl = atoi(*av);
+   xent.tbl = atoi(*av);
is_all = 0;
ac--; av++;
} else if (ac && _substrcmp(*av, "all") == 0) {
-   ent.tbl = 0;
+

Re: IPFW tables, dummynet and IPv6

2011-12-18 Thread Alexander V. Chernikov
Pawel Tyll wrote:
> Hi lists,
>
> Are there any plans to implement IPv6 tables in ipfw? It would seem
> that our gov. may want to force us into IPv6 in 6 months ;)
I've got a working implementation for IPv4+IPv6 and interface tables:

15:56 [0] zfsbase# /usr/obj/usr/src/sbin/ipfw/ipfw table 2 list
1.2.3.4/30 0
2a02:978::/64 0


15:16 [0] zfsbase# /usr/obj/usr/src/sbin/ipfw/ipfw table 4 list
em4/em4 2
vlan144/vlan144 1
vlan145/vlan145 11000
vlan146/vlan146 12000


I plan to commit it today/tomorrow.
8.2-S diff will be available, too


 
 Cheers.
 
 
 ___
 freebsd-...@freebsd.org mailing list
 http://lists.freebsd.org/mailman/listinfo/freebsd-net
 To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org
 






Re: Limit src address may not work well:

2011-12-03 Thread Alexander V. Chernikov
Blog Tieng Viet wrote:
> Dear all,
>
> I am using IPFW in FreeBSD 7.3-RELEASE.
> I have some problems as following:
>
> Limit src address may not work well:
>
> For example, I want to limit google robot to no more than 1 established connection:
>
> ${fwcmd} add 5625 pass tcp from 66.249.0.0/16 to me 80 limit src-addr 1
>
> But I saw there are about 6 ESTABLISHED connections from this address in the output of
> netstat -n
>
> Is it my mistake? Please give me some advice.

Do you have some rule before 5625 consuming all established TCP traffic,
for example?

You need ALL traffic from '66.249.0.0/16 to me 80' to match this
exact rule.


 
 Best regards.
 
 
 --- On Thu, 11/3/11, Tim Gustafson t...@soe.ucsc.edu wrote:
 
 From: Tim Gustafson t...@soe.ucsc.edu
 Subject: Re: IPFW Problems
 To: Michael Sierchio ku...@tenebras.com
 Cc: freebsd-ipfw@freebsd.org
 Date: Thursday, November 3, 2011, 1:56 AM
 You may want to tweak the sysctl
 items that control the lifespan
 of dynamic rules.

 sysctl net.inet.ip.fw

 in particular, the default value of
 net.inet.ip.fw.dyn_ack_lifetime
 is probably way too long for your purposes.
 Here's what I have right now:

 root@bsd-02: sysctl net.inet.ip.fw
 net.inet.ip.fw.static_count: 48
 net.inet.ip.fw.default_to_accept: 0
 net.inet.ip.fw.tables_max: 128
 net.inet.ip.fw.default_rule: 65535
 net.inet.ip.fw.verbose_limit: 0
 net.inet.ip.fw.verbose: 0
 net.inet.ip.fw.autoinc_step: 100
 net.inet.ip.fw.one_pass: 1
 net.inet.ip.fw.enable: 1
 net.inet.ip.fw.dyn_keepalive: 1
 net.inet.ip.fw.dyn_short_lifetime: 5
 net.inet.ip.fw.dyn_udp_lifetime: 10
 net.inet.ip.fw.dyn_rst_lifetime: 1
 net.inet.ip.fw.dyn_fin_lifetime: 1
 net.inet.ip.fw.dyn_syn_lifetime: 20
 net.inet.ip.fw.dyn_ack_lifetime: 300
 net.inet.ip.fw.dyn_max: 32768
 net.inet.ip.fw.dyn_count: 805
 net.inet.ip.fw.curr_dyn_buckets: 256
 net.inet.ip.fw.dyn_buckets: 256

 I'm assuming that's in seconds.  Is 300 seconds too
 long?  It seems like the dynamic rules are hanging
 around for hours or days, and I think the timeout is getting
 reset by the fact that the system is constantly sending out
 ACK packets to clients that aren't acknowledging them.

 -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
 Tim Gustafson                                            t...@soe.ucsc.edu
 Baskin School of Engineering                                 831-459-5354
 UC Santa Cruz                                     Baskin Engineering 317B
 -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-


Re: ipfw nat drops icmp packets from localhost

2011-10-06 Thread Alexander V. Chernikov
On 06.10.2011 14:42, Oleg Strizhak wrote:
 Hello, Andrey V. Elsukov!
 
 You wrote on 06.10.2011 at 13:38:
 
 On 06.10.2011 12:29, Oleg Strizhak wrote:
 After an investigation I've found out a very strange situation - it
 seems to me, that ipfw nat drops
 some (type 11?) icmp reply packets, whose udp request packets it
 hasn't rewritten/seen before, e.g:

 So, I wonder whether someone else has seen the same case under the
 similar circumstances? Isn't it a
 bug within ipfw nat module and is there any work-around/patch for
 that? I've surely googled, but in
 vain =( The only thing, that seems alike to my problem, is
 http://www.freebsd.org/cgi/query-pr.cgi?pr=129093, but the patch for
 8 branch didn't cure anything =(

 Can you describe how you did apply and test this patch?
 
 In the usual way =) Unfortunately, the patch copy-pasted from the page
 mentioned above could not be applied; it failed with an error:

svn diff -c 223835 svn://svn.freebsd.org/base/stable/8 > ~/r223835.diff
Can you try the patch attached (just to be sure) ?

This is exactly the situation described in this PR (and some related ones),
and this revision definitely fixes it.

Btw, what is the value of the net.inet.ip.fw.one_pass sysctl?
Are you sure that ipfw is the only firewall enabled on this machine?
Are you sure that the system is running the new kernel?


 
 $ patch < ~/ip_fw_nat.patch
 Hmm...  Looks like a unified diff to me...
 The text leading up to this was:
 --
 |--- stable/8/sys/netinet/ipfw/ip_fw_nat.c  Thu Jul 7 08:33:58 2011 (r223834)
 |+++ stable/8/sys/netinet/ipfw/ip_fw_nat.c  Thu Jul 7 09:29:11 2011 (r223835)
 --
 Patching file ip_fw_nat.c using Plan A...
 patch:  malformed patch at line 4: else
 
 The same results were obtained with combinations of the -p5 and -l options
 and the tail +2 ~/ip_fw_nat.patch command.
 Finally, I modified the patch (which applies w/o a word =) a little bit
 w/o any difference to the original one:
 
  $ /usr/bin/diff -wBbu3 ~/ip_fw_nat.patch ~/ip_fw_nat.patch.my
 --- /root/ip_fw_nat.patch   2011-10-04 14:08:32.0 +0400
 +++ /root/ip_fw_nat.patch.my    2011-10-04 14:29:53.0 +0400
 @@ -1,5 +1,5 @@
 ---- stable/8/sys/netinet/ipfw/ip_fw_nat.c  Thu Jul 7 08:33:58 2011 (r223834)
 -+++ stable/8/sys/netinet/ipfw/ip_fw_nat.c  Thu Jul 7 09:29:11 2011 (r223835)
 +--- ip_fw_nat.c.orig   2010-12-21 20:09:25.0 +0300
 ++++ ip_fw_nat.c    2011-10-04 14:27:02.0 +0400
  @@ -263,17 +263,27 @@
   else
   retval = LibAliasOut(t->lib, c,
 
 then I recompiled the kernel, rebooted server and.. all is just the same =(
 
 WBR,
 Oleg
 

Index: sys/netinet/ipfw/ip_fw_nat.c
===
--- sys/netinet/ipfw/ip_fw_nat.c	(revision 223834)
+++ sys/netinet/ipfw/ip_fw_nat.c	(revision 223835)
@@ -263,17 +263,27 @@
 	else
 		retval = LibAliasOut(t->lib, c,
 			mcl->m_len + M_TRAILINGSPACE(mcl));
-	if (retval == PKT_ALIAS_RESPOND) {
-		m->m_flags |= M_SKIP_FIREWALL;
-		retval = PKT_ALIAS_OK;
-	}
-	if (retval != PKT_ALIAS_OK &&
-	    retval != PKT_ALIAS_FOUND_HEADER_FRAGMENT) {
+
+	/*
+	 * We drop packet when:
+	 * 1. libalias returns PKT_ALIAS_ERROR;
+	 * 2. For incoming packets:
+	 *	a) for unresolved fragments;
+	 *	b) libalias returns PKT_ALIAS_IGNORED and
+	 *		PKT_ALIAS_DENY_INCOMING flag is set.
+	 */
+	if (retval == PKT_ALIAS_ERROR ||
+	    (args->oif == NULL && (retval == PKT_ALIAS_UNRESOLVED_FRAGMENT ||
+	    (retval == PKT_ALIAS_IGNORED &&
+	    (t->lib->packetAliasMode & PKT_ALIAS_DENY_INCOMING) != 0)))) {
 		/* XXX - should i add some logging? */
 		m_free(mcl);
 		args->m = NULL;
 		return (IP_FW_DENY);
 	}
+
+	if (retval == PKT_ALIAS_RESPOND)
+		m->m_flags |= M_SKIP_FIREWALL;
 	mcl->m_pkthdr.len = mcl->m_len = ntohs(ip->ip_len);
 
 	/*

Property changes on: sys/contrib/pf
___
Modified: svn:mergeinfo
   Merged /head/sys/contrib/pf:r222806


Property changes on: sys/contrib/dev/acpica
___
Modified: svn:mergeinfo
   Merged /head/sys/contrib/dev/acpica:r222806


Property changes on: sys/cddl/contrib/opensolaris
___
Modified: svn:mergeinfo
   Merged /head/sys/cddl/contrib/opensolaris:r222806


Property changes on: sys/amd64/include/xen
___
Modified: svn:mergeinfo
   Merged /head/sys/amd64/include/xen:r222806


Property changes on: sys
___
Modified: svn:mergeinfo
   Merged /head/sys:r222806


Re: kern/122109: [ipfw] ipfw nat traceroute problem

2011-06-03 Thread Alexander V. Chernikov
The following reply was made to PR kern/122109; it has been noted by GNATS.

From: Alexander V. Chernikov melif...@ipfw.ru
To: bug-follo...@freebsd.org, m.dyadche...@211.ru, a...@freebsd.org
Cc:  
Subject: Re: kern/122109: [ipfw] ipfw nat traceroute problem
Date: Fri, 03 Jun 2011 10:08:13 +0400

 Problem is actually a bit deeper.
 
 Before the libalias-based kernel NAT appeared, natd used the
 PKT_ALIAS_IGNORED return code to drop packets iff the
 PKT_ALIAS_DENY_INCOMING flag was set:
 
 	status = LibAliasIn (mla, buf, IP_MAXPACKET);
 	if (status == PKT_ALIAS_IGNORED &&
 	    mip->dropIgnoredIncoming) {
 
 		if (verbose)
 			printf (" dropped.\n");
 
 
 The current ipfw nat (and ng_nat) implementation simply drops every packet
 with a PKT_ALIAS_IGNORED return code:
 
 	if (retval != PKT_ALIAS_OK &&
 	    retval != PKT_ALIAS_FOUND_HEADER_FRAGMENT) {
 		/* XXX - should i add some logging? */
 		m_free(mcl);
 
 Most PKT_ALIAS_IGNORED results are returned when no state is found (the
 rest come from some, possibly very rare, unknown errors or handler errors).
 
 Libalias automatically creates a new state for every packet not found in
 the aliasing database if it is reasonable to do so (for TCP/UDP packets it
 is definitely reasonable, since they represent logical sessions; for ICMP
 request/reply it is reasonable too, etc.). Conversely, there is no reason
 to create state for packets signaling errors in some existing session
 (ICMP unreachable, etc.), since such packets are rare and unidirectional,
 and no reply is needed.
 
 The only two places where states are not created (not counting the
 PKT_ALIAS_PROXY_ONLY and PKT_ALIAS_DENY_INCOMING modes) are the
 IcmpAliasIn2() and IcmpAliasOut2() functions.
 
 Those functions dispatch various ICMP notifications and try to map each
 notification to an existing state using the original packet header carried
 within the ICMP message. If no such session is found (the case in this PR,
 since locally originated packets are usually not passed to libalias and no
 replies are transmitted because of the way traceroute works), the return
 code is set to PKT_ALIAS_IGNORED.
 
 As a result: restoring original behavior should not break anything.
 
 This patch seems to fix the problem:
 
 Index: ip_fw_nat.c
 ===
 --- ip_fw_nat.c (revision 221263)
 +++ ip_fw_nat.c (working copy)
 @@ -267,8 +267,9 @@
  		m->m_flags |= M_SKIP_FIREWALL;
  		retval = PKT_ALIAS_OK;
  	}
 -	if (retval != PKT_ALIAS_OK &&
 -	    retval != PKT_ALIAS_FOUND_HEADER_FRAGMENT) {
 +	if (retval == PKT_ALIAS_ERROR || retval == PKT_ALIAS_UNRESOLVED_FRAGMENT ||
 +	    (retval == PKT_ALIAS_IGNORED &&
 +	     (t->lib->packetAliasMode & PKT_ALIAS_DENY_INCOMING))) {
  		/* XXX - should i add some logging? */
  		m_free(mcl);
  		args->m = NULL;
 
 
 Something similar should be applied to ng_nat.c
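
For comparison, the two drop conditions from the patch above can be written as standalone predicates. This is only a sketch for reasoning about the change: the `RET_*` codes and `MODE_DENY_INCOMING` are stand-ins named after the constants quoted in this message, with illustrative values that are not copied from alias.h, and the function names are hypothetical.

```c
#include <stdbool.h>

/* Stand-ins for the libalias return codes; values are illustrative only. */
enum nat_ret {
	RET_ERROR,
	RET_OK,
	RET_IGNORED,
	RET_UNRESOLVED_FRAGMENT,
	RET_FOUND_HEADER_FRAGMENT
};

/* Stand-in for the PKT_ALIAS_DENY_INCOMING mode bit (value assumed). */
#define MODE_DENY_INCOMING 0x02u

/* Condition removed by the patch: drop all but OK / FOUND_HEADER_FRAGMENT. */
static bool
drop_before_patch(enum nat_ret r, unsigned int mode)
{
	(void)mode;		/* the old check ignored the alias mode */
	return (r != RET_OK && r != RET_FOUND_HEADER_FRAGMENT);
}

/*
 * Condition added by the patch: drop on error or unresolved fragment,
 * and drop ignored packets only when deny_incoming is requested.
 */
static bool
drop_after_patch(enum nat_ret r, unsigned int mode)
{
	return (r == RET_ERROR || r == RET_UNRESOLVED_FRAGMENT ||
	    (r == RET_IGNORED && (mode & MODE_DENY_INCOMING) != 0));
}
```

The traceroute case in this PR corresponds to `RET_IGNORED` with deny_incoming unset: `drop_before_patch()` returns true for it, `drop_after_patch()` returns false, which matches the natd behavior described earlier.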


Re: kern/157379: [ipfw] mtr does not work if I use ipfw nat

2011-05-30 Thread Alexander V. Chernikov
The following reply was made to PR kern/157379; it has been noted by GNATS.

From: Alexander V. Chernikov melif...@ipfw.ru
To: bug-follo...@freebsd.org, kes-...@yandex.ru
Cc:  
Subject: Re: kern/157379: [ipfw] mtr does not work if I use ipfw nat
Date: Mon, 30 May 2011 15:23:34 +0400

 This seems to be a duplicate of kern/122109


Re: kern/122109: [ipfw] ipfw nat traceroute problem

2010-09-21 Thread Alexander V. Chernikov
The following reply was made to PR kern/122109; it has been noted by GNATS.

From: Alexander V. Chernikov melif...@ipfw.ru
To: bug-follo...@freebsd.org, m.dyadche...@211.ru
Cc:  
Subject: Re: kern/122109: [ipfw] ipfw nat traceroute problem
Date: Wed, 22 Sep 2010 01:24:40 +0400

 Problem can be fixed with a small patch:
 
 --- /usr/src/sys/netinet/libalias/alias.c.orig  2010-09-22 01:07:19.0 +0400
 +++ /usr/src/sys/netinet/libalias/alias.c   2010-09-22 01:11:11.0 +0400
 @@ -432,7 +432,7 @@
 }
 return (PKT_ALIAS_OK);
 }
 -   return (PKT_ALIAS_IGNORED);
 +   return (PKT_ALIAS_OK);
  }
 
 
 IcmpAliasIn2() doesn't create state for incoming packets (unlike
 IcmpAliasIn1(), which does).
 
 IcmpAliasIn2() is called only for the ICMP_UNREACH, ICMP_SOURCEQUENCH,
 ICMP_TIMXCEED and ICMP_PARAMPROB types.
 
 If an incoming ICMP packet of one of these types is not found in the
 internal state table, we can just pass it to the host system (back to the
 ipfw or netgraph hook, really) without even creating state.
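
The reasoning above can be sketched as a small classifier. This is an illustration under stated assumptions, not libalias code: the ICMP type numbers are the standard ones from RFC 792, while `is_icmp_error_type()`, `icmp_error_in()` and the `verdict` enum are hypothetical names introduced here.

```c
#include <stdbool.h>

/* Standard ICMP type numbers (RFC 792). */
#define ICMP_ECHOREPLY		0
#define ICMP_UNREACH		3
#define ICMP_SOURCEQUENCH	4
#define ICMP_ECHO		8
#define ICMP_TIMXCEED		11
#define ICMP_PARAMPROB		12

/* Hypothetical outcome for an incoming ICMP error message. */
enum verdict {
	VERDICT_REWRITE,	/* matching state found: de-alias the packet */
	VERDICT_PASS_UNCHANGED	/* no state: hand it to the host as-is */
};

/* True for the ICMP types that the message says reach IcmpAliasIn2(). */
static bool
is_icmp_error_type(int type)
{
	switch (type) {
	case ICMP_UNREACH:
	case ICMP_SOURCEQUENCH:
	case ICMP_TIMXCEED:
	case ICMP_PARAMPROB:
		return (true);
	default:
		return (false);
	}
}

/*
 * Behavior proposed by the patch: an error message with no matching
 * state is passed through rather than reported as ignored, and no new
 * state is created for it.
 */
static enum verdict
icmp_error_in(bool state_found)
{
	return (state_found ? VERDICT_REWRITE : VERDICT_PASS_UNCHANGED);
}
```

A traceroute reply (ICMP_TIMXCEED, type 11) for a locally originated probe would take the `VERDICT_PASS_UNCHANGED` branch in this model.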