Re: ipfw divert filter for IPv4 geo-blocking

Dr. Rolf Jansen Mon, 01 Aug 2016 04:17:12 -0700

> On 01.08.2016 at 03:17, Julian Elischer <jul...@freebsd.org> wrote:
> On 30/07/2016 10:17 PM, Dr. Rolf Jansen wrote:
>> I finished the work on CIDR conformity of the IP ranges tables generated by 
>> the tool geoip. The main constraint is that the start and end address of an 
>> IP block given by the delegation files MUST BE PRESERVED during the 
>> transformation to a set of CIDR records. This target is achieved by:
>> 
>>  1. Finding the largest common netmask boundary of the start address
>>     utilizing int(log2(addr_count)); then iteration like Euclid's
>>     algorithm in computing a GCD.
>> 
>>  2. Output the CIDR with the given start address and the masklen belonging
>>     to the found netmask.
>> 
>>  3. If the CIDR does not match the whole original IP range then set the start
>>     address of the next CIDR block to the next boundary of the common netmask,
>>     and loop over starting at 1. until the original range has been satisfied.
>> 
> Check out the AppleTalk code I pointed out to you. I wrote that in '93 or so,
> but I remember sweating blood over it to get it right.


I read the description of the code, and the following sentence makes me doubt 
that aa_dorangeroute() can guarantee the main constraint mentioned above, 
namely that the "start and end address of an IP block given by the delegation 
files MUST BE PRESERVED". According to the description, the start and end 
addresses can be anything (even undefined), but they are not fixed.

   ... 
   Split the range into two subranges such that the middle
   of the two ranges is the point where the highest bit of difference
   between the two addresses makes its transition.
   ...

I do not want this.
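
For illustration, the three steps quoted at the top of this message can be
sketched in Python as follows. This is not the C code of the geoip tool; the
function name range_to_cidrs and the halving loop (one possible reading of the
"iteration like Euclid's algorithm") are assumptions made for this sketch. The
point is that the first block starts exactly at the given start address and the
last block ends exactly at the given end address.

```python
import ipaddress

def range_to_cidrs(start, end):
    """Split the inclusive IPv4 range [start, end] into CIDR blocks (sketch)."""
    lo = int(ipaddress.IPv4Address(start))
    hi = int(ipaddress.IPv4Address(end))
    blocks = []
    while lo <= hi:
        count = hi - lo + 1
        # Step 1: largest power of two not exceeding the remaining count,
        # i.e. 2**int(log2(count)) ...
        size = 1 << (count.bit_length() - 1)
        # ... halved until the start address sits on that netmask boundary.
        while lo % size != 0:
            size >>= 1
        # Step 2: emit the CIDR with the unchanged start address.
        masklen = 32 - (size.bit_length() - 1)
        blocks.append(f"{ipaddress.IPv4Address(lo)}/{masklen}")
        # Step 3: continue at the next boundary until the range is covered.
        lo += size
    return blocks

print(range_to_cidrs("10.0.0.5", "10.0.0.20"))
# ['10.0.0.5/32', '10.0.0.6/31', '10.0.0.8/29', '10.0.0.16/30', '10.0.0.20/32']
```

The standard library function ipaddress.summarize_address_range() yields the
same blocks for this input, which makes it a convenient cross-check.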

>> I carefully tested the algorithm and a table that I pipe by the new geoip 
>> tool into ipfw is 100 % identical to the output of the ipfw command 'table N 
>> list'.
> though that doesn't mean it is semantically identical to the original table 
> due to 'most specific rule wins' behaviour.
> 
> for example:
> if you type in ;
> 
> 1.2.3.0/24 -> A
> and
> 1.2.3.0/26 -> B
> then both rules will be listed the same as what you put in
> but if you wanted to get all rules that point to A, without having rules that 
> point to B, then you would have to export
> 1.2.3.64/26  -> A
> 1.2.3.128/25 -> A
>  (i.e. TWO rules)

This is definitely not the use case. The data to be passed to the ipfw tables 
originates from the RIR delegation statistics files, and it is guaranteed to be 
consolidated, i.e. overlaps are resolved and adjacent ranges are joined, long 
before any tables for ipfw are generated. Each range entry has a well-defined, 
i.e. fixed, non-variable starting address, and anything that changes the 
starting address of a range renders the table useless. Every entry also has a 
well-defined range length, which must not be changed either, or the table would 
be useless as well.
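
As a rough sketch of what "resolved overlaps and joined adjacencies" means,
assuming the ranges are plain inclusive (start, end) integer pairs and ignoring
the country codes that the delegation files carry per record (the function name
consolidate is hypothetical, and this is not the code of the geoip tools):

```python
def consolidate(ranges):
    """Merge overlapping and adjacent inclusive (start, end) integer ranges."""
    merged = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1] + 1:
            # Overlap or direct adjacency: extend the previous range.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            # Gap before this range: start a new entry.
            merged.append([start, end])
    return [tuple(r) for r in merged]

print(consolidate([(10, 20), (21, 30), (5, 12), (40, 50)]))
# [(5, 30), (40, 50)]
```

After such a step no two entries overlap, so no address can match more than
one entry.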

In addition, we are talking about the automatic generation of thousands of 
entries, and I will never rely on something like 'most specific rule wins' 
behaviour. I want the behaviour to be as explicit as possible, and for this 
reason I am happy with 'INPUT is 100 % identical to the OUTPUT'.
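
To illustrate what "explicit" means for the quoted 1.2.3.0/24 example: the
non-overlapping form of "the /24 without the /26" can be computed with
address_exclude() from Python's standard ipaddress module. This is only a
sketch of the idea, not something the geoip tool does.

```python
import ipaddress

outer = ipaddress.ip_network("1.2.3.0/24")   # -> A in the quoted example
inner = ipaddress.ip_network("1.2.3.0/26")   # -> B in the quoted example

# Networks that cover outer minus inner, without any overlap.
explicit = sorted(outer.address_exclude(inner))
print([str(net) for net in explicit])
# ['1.2.3.64/26', '1.2.3.128/25']
```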

> you could also export
> 1.2.3.0/24 -> A
> 1.2.3.0/26 -> 0  (think of it as an "EXCEPT for these" rule)
> 
> which is ALSO two rules but you would need to be sure that the receiver knows 
> what to do with them.

In this context, this is simply a ridiculous example. It sounds like you are 
suggesting fuzzing the input data in order to push ipfw to its limits. 
This makes life less boring, doesn't it? No thanks.

>> It is worth noting that the original RIR delegation files already contain 
>> 457 non-CIDR-conforming IPv4 ranges in a total of 165815 original records. I 
>> guess that this number will increase in the future because the RIRs have run 
>> out of new IPv4 ranges and are urged to subdivide returned old ranges for 
>> new delegations. The above algorithm is ready for this.
>> 
>> Generally, CIDR-conforming tables are more than twice as large as optimized 
>> (joined adjacencies) IP range tables. All said changes have been pushed to 
>> GitHub already.
>> 
> Unfortunately there is no way to specify (using cidr notation) a.b.1.x AND 
> a.b.2.x without including a.b.[03].x.
> 
> HOWEVER
> if you specified the FULL table you could use the "except" feature of routing 
> table behaviour where
> a.b.0.x/22  -> A
> a.b.0.x/24  -> B
> a.b.3.x/24  -> B
> gives you the same thing because of the 'most specific rule wins' nature of 
> routing table evaluation.
> I believe this is the case in the tables you imported.
> the trick is to be able to take an "optimised" table such as that above and 
> produce, given a required subset, just the required part, while changing the 
> rules as needed on the fly to "de-optimise" them enough to maintain 
> correctness.

Again, this is not the use case.
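
For readers who want to see the 'most specific rule wins' evaluation from the
quoted a.b.0.x/22 example in action, here is a minimal sketch. The prefix
10.1.0.0/22 is a hypothetical stand-in for a.b.0.x/22, the function name lookup
is made up, and the linear scan is only for illustration; it is not how
routing tables or ipfw tables are implemented.

```python
import ipaddress

# Hypothetical stand-in for the quoted a.b.0.x/22 -> A table with two
# more specific /24 exceptions pointing to B.
table = [
    (ipaddress.ip_network("10.1.0.0/22"), "A"),
    (ipaddress.ip_network("10.1.0.0/24"), "B"),
    (ipaddress.ip_network("10.1.3.0/24"), "B"),
]

def lookup(table, addr):
    """Return the value of the longest matching prefix, or None."""
    ip = ipaddress.ip_address(addr)
    matches = [(net.prefixlen, value) for net, value in table if ip in net]
    return max(matches, key=lambda m: m[0])[1] if matches else None

for addr in ("10.1.0.5", "10.1.1.5", "10.1.2.5", "10.1.3.5"):
    print(addr, lookup(table, addr))
# 10.1.0.5 B
# 10.1.1.5 A
# 10.1.2.5 A
# 10.1.3.5 B
```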

>> I am still a little bit amazed that ipfw accepts incorrect CIDR ranges 
>> and arbitrarily moves the start/end addresses in order to achieve CIDR 
>> conformity, and that without any further notice, given that ipfw can be 
>> considered quite relevant to system security. Or may I assume that ipfw 
>> always knows better than the user what should be allowed or denied? 
>> Otherwise, perhaps I am the only one who has ever input incorrect CIDR 
>> ranges for processing by ipfw.
> 
> I answered this before but can't see the answer in my out box, plus I have 
> added info..
> 
> The ipfw code is derived from the routing code.  it is shorthand notation for 
> a.b.c.d [netmask e.f.g.h ]
> there is nothing that says that a.b.c.d need be the first address in the 
> range. (though some vendors may require that.)
> to quote wikipedia on the topic (yes, I know, not an authoritative source)
> 
> ==== quote ====
> The address may denote a single, distinct interface address or the beginning 
> address of an entire network. The maximum size of the network is given by the 
> number of addresses that are possible with the remaining, least-significant 
> bits below the prefix. The aggregation of these bits is often called the host 
> identifier.
> 
> For example:
> 
>       • 192.168.100.14/24 represents the IPv4 address 192.168.100.14 and its 
> associated routing prefix 192.168.100.0, or equivalently, its subnet mask 
> 255.255.255.0, which has 24 leading 1-bits.
> ==== end quote ====
> 
> I use this all the time when parsing information that contains a hostname, 
> and I know the netmask width. It saves me from having to have complicated 
> shell code to pull apart the address and zero out the host bits of the 
> address.

I got it. Anyway, this is no longer an issue for the new geoip table 
generation.
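
For reference, the host-bit masking described in the quoted 192.168.100.14/24
example can be reproduced with Python's standard ipaddress module. This sketch
only illustrates the notation; it says nothing about how ipfw parses its
arguments internally.

```python
import ipaddress

# An address with host bits set plus a prefix length.
iface = ipaddress.ip_interface("192.168.100.14/24")
print(iface.ip)        # 192.168.100.14
print(iface.network)   # 192.168.100.0/24
print(iface.netmask)   # 255.255.255.0

# Lenient parsing silently zeroes the host bits ...
lenient = ipaddress.ip_network("192.168.100.14/24", strict=False)
print(lenient)         # 192.168.100.0/24

# ... while strict parsing (the default) rejects the input.
try:
    ipaddress.ip_network("192.168.100.14/24")
except ValueError as err:
    print("rejected:", err)
```

A generator that wants start addresses to be preserved can use the strict
form as a sanity check before handing entries to another tool.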

Best regards

Rolf

_______________________________________________
freebsd-ipfw@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-ipfw
To unsubscribe, send any mail to "freebsd-ipfw-unsubscr...@freebsd.org"
