On 28/12/13 13:42, Sebastian Moeller wrote:
Hi Fred,
On Dec 28, 2013, at 12:09 , Fred Stratton <[email protected]> wrote:
The UK consensus fudge factor has always been 85 per cent of the rate
achieved, not 95 or 99 per cent.
I know that the recommendations have been lower in the past; I think this is partly because,
before Jesper Brouer's and Russell Stuart's work to properly account for ATM
"quantization", people typically had to deal with a ~10% rate tax for the 5 byte per cell
overhead (48 byte payload in 53 byte cells, 90.57% usable rate) plus an additional 5% to
stochastically account for the padding of the last cell and the per-packet overhead. Both of
these affect the effective goodput far more for small packets than for large ones, so the 85%
never worked well for all packet sizes. My hypothesis now is that, since we can and do properly
account for these effects of ATM framing, we can afford to start with a fudge factor of 90% or
even 95%. As far as I know the recommended fudge factors are never explained by more than
"this works empirically"...
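The arithmetic behind these numbers can be sketched in shell as follows. This is only an illustration under stated assumptions: the function names shaped_rate and atm_payload_permille are made up for this example, integer arithmetic rounds down, and the 16402 kbit/s sync rate is simply the ADSL2+ downstream rate quoted elsewhere in this thread.

```shell
# shaped_rate SYNC_KBPS FUDGE_PERCENT
# Prints the rate to hand to the shaper, in kbit/s (rounded down).
# Hypothetical helper, not part of any CeroWrt script.
shaped_rate() {
    local sync_kbps=$1 fudge_percent=$2
    echo $(( sync_kbps * fudge_percent / 100 ))
}

# atm_payload_permille
# Prints the share of each 53 byte ATM cell that carries payload (48 bytes),
# in units of 1/1000, rounded down.
atm_payload_permille() {
    echo $(( 48 * 1000 / 53 ))
}

echo "ATM payload share: $(atm_payload_permille) per mille"
for fudge in 85 90 95; do
    echo "fudge ${fudge}%: $(shaped_rate 16402 ${fudge}) kbit/s"
done
```

If the shaper already accounts for ATM cells and per-packet overhead, only the fudge factor is applied to the rate; the 48/53 share is shown for comparison with the older 85% rule of thumb.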
The fudge factors are totally empirical. If you are proposing a more
formal approach, I shall try a 90 per cent fudge factor, although
'current rate' varies here.
Devices express 2 values: the sync rate - or 'maximum rate attainable' - and
the dynamic value of 'current rate'.
The actual data rate is the relevant information for shaping. DSL modems often
report the link capacity as "maximum rate attainable" or some such, while the
actual bandwidth is limited by contract to a rate below what the line would support
(often this bandwidth reduction is performed on the PPPoE link to the BRAS).
As the sync rate is fairly stable for any given installation - ADSL or Fibre -
this could be used as a starting value, decremented by the traditional 15 per
cent of 'overhead', and the 85 per cent fudge factor applied to that.
I would like to propose to use the "current rate" as starting point, as
'maximum rate attainable' >= 'current rate'.
'current rate' is still a sync rate, and so is conventionally viewed as
15 per cent above the unmeasurable actual rate. As you are proposing a
new approach, I shall take 90 per cent of 'current rate' as a starting
point.
No one in the UK uses SRA currently. One small ISP used to. The ISP I
currently use has Dynamic Line Management, which changes target SNR
constantly. The DSLAM is made by Infineon.
Fibre - FTTC - connections can suffer quite large download speed fluctuations
over the 200 - 500 metre link to the MSAN. This phenomenon is not confined to
ADSL links.
On the actual xDSL link? As far as I know no telco actually uses SRA
(seamless rate adaptation or so), so the current link speed will only get lower,
not higher. I would therefore expect a relatively stable current rate (it might take a
while, a few days, to slowly degrade to the highest link speed
supported under all conditions, but I hope you still get my point).
I understand the point, but do not think it is the case, from data I
have seen, but cannot find now, unfortunately.
An alternative speed test is something like this
http://download.bethere.co.uk/downloadMeter.html
which, as Be has been bought by Sky, may not exist after the end of April 2014.
But, if we recommend to run speed tests we really need to advise our
users to start several concurrent up- and downloads to independent servers to
actually measure the bandwidth of our bottleneck link; often a single server
connection will not saturate a link (I seem to recall that with TCP it is
guaranteed to only reach 75% or so averaged over time, is that correct?).
But I think this is not the proper way to set the bandwidth for the
shaper, because upstream of our link to the ISP we have no guaranteed bandwidth
at all and just can hope the ISP is doing the right thing AQM-wise.
I quote the Be site as an alternative to a java based approach. I would
be very happy to see your suggestion adopted.
• [What is the proper description here?] If you use PPPoE (but not over
ADSL/DSL link), PPPoATM, or bridging that isn’t Ethernet, you should choose
[what?] and set the Per-packet Overhead to [what?]
For a PPPoA service, the PPPoA link is treated as PPPoE on the second device,
here running ceroWRT.
This still means you should specify the PPPoA overhead, not PPPoE.
I shall try the PPPoA overhead.
The packet overhead values are written in the dubious man page for tc_stab.
The only real flaw in that man page, as far as I know, is the fact that
it indicates that the kernel will account for the 18byte ethernet header
automatically, while the kernel does no such thing (which I hope to change).
It mentions link layer types as 'atm', 'ethernet' and 'adsl'. There is no
reference anywhere to the last. I do not see its relevance.
Sebastian has a potential alternative method of formal calculation.
So, I have no formal calculation method available, but an empirical way
of detecting ATM quantization as well as measuring the per packet overhead of
an ATM link.
The idea is to measure the RTT of ICMP packets of increasing length and then display the
distribution of RTTs by ICMP packet length. On an ATM carrier we expect to see a step function with
steps 48 bytes apart; for a non-ATM carrier we expect to see a smooth ramp instead. We compare the
residuals of a linear fit of the data with the residuals of the best step function fit to the data,
and the fit with the lower residuals "wins". Attached you will find an example of this
approach: ping data in red (median of NNN repetitions for each ICMP packet size), linear fit in
blue, and best staircase fit in green. You will notice that the data starts somewhere inside a 48 byte
ATM cell. Since the ATM encapsulation overhead is at most 44 bytes and we know the IP and ICMP
overhead of the ping probe, we can calculate the overhead preceding the IP header, which is what
needs to be put in the overhead field in the GUI. (Note where the green line intersects the y-axis
at 0 bytes packet size: this is where the IP header starts, and the "missing" part of this ATM cell
is the overhead.)
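To make the expected staircase concrete, here is a minimal sketch of how many ATM cells a ping of a given size should occupy. It assumes an IPv4 header without options (20 bytes) and the 8 byte ICMP echo header in front of the ping payload; the function name atm_cells is made up for this example, and the overhead of 40 bytes is just one of the values mentioned in this thread, not a measurement.

```shell
# atm_cells ICMP_PAYLOAD OVERHEAD
# Prints the number of ATM cells needed for one ping packet.
# Each cell carries 48 bytes; a partly filled last cell is padded to a full cell.
atm_cells() {
    local icmp_payload=$1 overhead=$2
    # 8 byte ICMP header + 20 byte IPv4 header + encapsulation overhead
    local packet=$(( icmp_payload + 8 + 20 + overhead ))
    # integer ceiling of packet / 48
    echo $(( (packet + 47) / 48 ))
}

# Expected cell count across the sweep range (ping -s 16 ... 116),
# assuming 40 bytes of per-packet overhead.
for size in 16 28 29 76 77 116; do
    echo "ping -s ${size}: $(atm_cells ${size} 40) cells"
done
```

With these assumptions the cell count jumps between sizes 28 and 29 and again between 76 and 77, that is, 48 bytes apart; a different overhead shifts where the jumps fall, which is what the staircase fit exploits.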
You are curve fitting. This is calculation.
Believe it or not, this method works reasonably well (I tested it
successfully with one Bridged, LLC/SNAP RFC-1483/2684 connection (overhead 32
bytes), and several PPPoE, LLC (overhead 40) connections (from ADSL1 @
3008/512 to ADSL2+ @ 16402/2558)). But it takes a relatively long time to measure
the ping train, especially at the higher rates, and it requires ping time stamps
with decent resolution (which rules out Windows), and my naive data acquisition
script creates really large raw data files. I guess I should post the code
somewhere so others can test and improve it.
Fred, I would be delighted to get a data set from your connection, to
test a known different encapsulation.
I shall try this. If successful, I shall initially pass you the raw
data. I have not used MatLab since the 1980s.
TYPICAL OVERHEADS
The following values are typical for different adsl scenarios (based on
[1] and [2]):
LLC based:
PPPoA - 14 (PPP - 2, ATM - 12)
PPPoE - 40+ (PPPoE - 8, ATM - 18, ethernet 14, possibly FCS -
4+padding)
Bridged - 32 (ATM - 18, ethernet 14, possibly FCS - 4+padding)
IPoA - 16 (ATM - 16)
VC Mux based:
PPPoA - 10 (PPP - 2, ATM - 8)
PPPoE - 32+ (PPPoE - 8, ATM - 10, ethernet 14, possibly FCS -
4+padding)
Bridged - 24+ (ATM - 10, ethernet 14, possibly FCS - 4+padding)
IPoA - 8 (ATM - 8)
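The table above can be turned into a small lookup, sketched below. The encapsulation labels (llc-pppoe and so on) and the function names are invented for this example; the byte values are the minimum figures from the table, so an included FCS (4 bytes plus padding) is not covered. The printed fragment follows the "stab overhead BYTES linklayer TYPE" option syntax of tc; where exactly it goes in a full tc command line should be checked against the tc-stab man page.

```shell
# overhead_for ENCAPSULATION
# Prints the per-packet overhead in bytes from the table above
# (minimum value where the table says "+"). Labels are hypothetical.
overhead_for() {
    case "$1" in
        llc-pppoa)     echo 14 ;;
        llc-pppoe)     echo 40 ;;
        llc-bridged)   echo 32 ;;
        llc-ipoa)      echo 16 ;;
        vcmux-pppoa)   echo 10 ;;
        vcmux-pppoe)   echo 32 ;;
        vcmux-bridged) echo 24 ;;
        vcmux-ipoa)    echo 8 ;;
        *) echo "unknown encapsulation: $1" >&2; return 1 ;;
    esac
}

# stab_fragment ENCAPSULATION
# Prints the size table options for an ATM link with that overhead.
stab_fragment() {
    local overhead
    overhead=$(overhead_for "$1") || return 1
    echo "stab overhead ${overhead} linklayer atm"
}

stab_fragment vcmux-pppoa
```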
For VC Mux based PPPoA, I am currently using an overhead of 18 for the PPPoE
setting in ceroWRT.
Yeah we could put this list into the wiki, but how shall a typical user
figure out which encapsulation is used? And good luck in figuring out whether
the frame check sequence (FCS) is included or not…
BTW 18, I predict that if PPPoE is only used between cerowrt and the "modem" or
gateway your effective overhead should be 10 bytes; I would love it if you could run
the following against your link at night (also attached):
#! /bin/bash
# TODO: use seq or bash to generate a list of the requested sizes (to allow for
# non-equidistantly spaced sizes)

TECH=ADSL2              # just to give some meaning to the ping trace file name
# Finding a proper target IP is somewhat of an art: just traceroute a remote site
# and find the nearest host that reliably responds to pings and shows the smallest
# variation of ping times.
TARGET=${1}             # the IP against which to run the ICMP pings
DATESTR=$(date +%Y%m%d_%H%M%S)  # to allow multiple sequential records
LOG=ping_sweep_${TECH}_${DATESTR}.txt

# Refuse to run without a target instead of calling ping with no host.
if [ -z "${TARGET}" ]; then
    echo "Usage: $0 TARGET_IP" >&2
    exit 1
fi

# By default non-root ping will only send one packet per second, so work around
# that by calling ping independently for each packet.
# Empirically figure out the shortest period still giving the standard ping
# time (to avoid being slow-pathed by our target).
PINGPERIOD=0.01         # in seconds
PINGSPERSIZE=10000

# Start size, needed to find the per packet overhead dependent on the ATM
# encapsulation. To reliably show ATM quantization one would like to see at
# least two steps, so cover a range > 2 ATM cells (so > 96 bytes).
SWEEPMINSIZE=16         # 64bit systems seem to require 16 bytes of payload to
                        # include a timestamp...
SWEEPMAXSIZE=116

i_sweep=0
i_size=0

echo "Running ICMP RTT measurement against: ${TARGET}"
while [ ${i_sweep} -lt ${PINGSPERSIZE} ]
do
    (( i_sweep++ ))
    echo "Current iteration: ${i_sweep}"
    # now loop from sweepmin to sweepmax
    i_size=${SWEEPMINSIZE}
    while [ ${i_size} -le ${SWEEPMAXSIZE} ]
    do
        echo "${i_sweep}. repetition of ping size ${i_size}"
        # run each ping in the background so a slow reply does not stall the sweep
        ping -c 1 -s ${i_size} "${TARGET}" >> "${LOG}" &
        (( i_size++ ))
        # we need a sleep binary that allows non integer times (GNU sleep is
        # fine as is sleep of macosx 10.8.4)
        sleep ${PINGPERIOD}
    done
done
# let the last background pings finish writing to the log
wait
echo "Done... ($0)"
This will try to run 10000 repetitions for ICMP packet sizes from 16 to 116
bytes, running for (10000 * 101 * 0.01 / 60 =) 168 minutes, but you should be able
to stop it with ctrl-c if you are not patient enough. With your link I would
estimate that 3000 should be plenty, but if you could run it overnight that
would be great, and then ~3 hours should not matter much.
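The run time estimate is easy to redo for other repetition counts. A minimal sketch, with a made-up function name sweep_minutes, the ping period given in milliseconds so that plain integer arithmetic suffices, and the result rounded down to whole minutes; it ignores the time each ping and sleep call itself takes to start.

```shell
# sweep_minutes REPETITIONS MIN_SIZE MAX_SIZE PERIOD_MS
# Prints the approximate duration of a full sweep in whole minutes.
sweep_minutes() {
    local reps=$1 min_size=$2 max_size=$3 period_ms=$4
    # both end points are pinged, so 16..116 is 101 sizes
    local n_sizes=$(( max_size - min_size + 1 ))
    echo $(( reps * n_sizes * period_ms / 60000 ))
}

echo "10000 repetitions: $(sweep_minutes 10000 16 116 10) minutes"
echo " 3000 repetitions: $(sweep_minutes 3000 16 116 10) minutes"
```

So the suggested 3000 repetitions come to roughly 50 minutes with the settings in the script.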
And then run the following attached code in octave or matlab. Invoke with
"tc_stab_parameter_guide_03('path/to/the/data/file/you/created/name_of_said_file')".
The parser will run on the first invocation and is really, really slow, but further
invocations should be faster. If issues arise, let me know, I am happy to help.
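For a quick look at the raw log before starting octave, something like the following could reduce it to two columns, ICMP payload size and RTT in milliseconds. This is not the parser inside tc_stab_parameter_guide_03, only a sketch; the function name parse_ping_log is made up, and it assumes the usual reply line format "N bytes from ADDRESS: icmp_seq=... ttl=... time=X ms", where N is the payload size plus the 8 byte ICMP header.

```shell
# parse_ping_log [FILE...]
# Reads ping output (from FILE or stdin) and prints "payload_size rtt_ms"
# for every reply line; all other lines are ignored.
parse_ping_log() {
    awk '/ bytes from / {
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^time=/) {
                sub(/^time=/, "", $i)
                # reported size includes the 8 byte ICMP header
                print $1 - 8, $i
            }
        }
    }' "$@"
}
```

It would be called as parse_ping_log followed by the name of the log file written by the sweep script, with the output redirected to a file of your choice.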
Were I to use a single directly connected gateway, I would input a suitable
value for PPPoA in that openWRT firmware.
I think you should do that right now.
The firmware has not yet been released.
In theory, I might need to use a negative value, but the current kernel does
not support that.
If you use tc_stab, negative overheads are fully supported, only
htb_private has overhead defined as unsigned integer and hence does not allow
negative values.
Jesper Brouer posted about this. I thought he was referring to tc_stab.
I have used many different arbitrary values for overhead. All appear to have
little effect.
So the issue here is that only at small packet sizes do the overhead and last
cell padding eat a disproportionate amount of your bandwidth (64 byte packet plus 44 byte
overhead plus 47 byte worst case cell padding: 100 * (44+47+64)/64 = 242% effective packet
size relative to what the shaper estimated). At typical packet sizes the maximum error (44 bytes
of missing overhead and potentially misjudged cell padding of 47 bytes) adds up to a
theoretical 100 * (44+47+1500)/1500 = 106% effective packet size relative to what the shaper
estimated. It is obvious that at 1500 byte packets the whole ATM issue can be easily
dismissed by just reducing the link rate by ~10% for the 48 in 53 framing and an
additional ~6% for overhead and cell padding. But once you mix smaller packets into your
traffic, say for VoIP, the effective wire size misjudgment will kill your ability to
control the queueing. Note that the common wisdom of shaping down to 85% might stem from the
~15% ATM "tax" on 1500 byte traffic...
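The two percentages above follow from the same formula, sketched here with a made-up function name effective_percent. It uses the worst case values from the text (44 bytes overhead, 47 bytes padding) and integer arithmetic that rounds down; it does not include the separate 48 in 53 cell framing.

```shell
# effective_percent PACKET_BYTES OVERHEAD_BYTES PADDING_BYTES
# Prints the effective packet size as a percentage of what a shaper
# unaware of overhead and cell padding would assume.
effective_percent() {
    local packet=$1 overhead=$2 padding=$3
    echo $(( 100 * (packet + overhead + padding) / packet ))
}

for size in 64 1500; do
    echo "${size} byte packet: $(effective_percent ${size} 44 47)%"
done
```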
As I understand it, the current recommendation is to use tc_stab in preference
to htb_private. I do not know the basis for this value judgement.
In short: tc_stab allows negative overheads, and tc_stab works with HTB,
TBF and HFSC, while htb_private only works with HTB. Currently htb_private has two
advantages: it will estimate the per packet overhead correctly if GSO (generic
segmentation offload) is enabled, and it will produce exact ATM link layer
estimates for all possible packet sizes. In practice almost everyone uses an
MTU of 1500 or less for their internet access, making both htb_private
advantages effectively moot. (Plus, if no one beats me to it, I intend to address
both theoretical shortcomings of tc_stab next year.)
Best Regards
Sebastian
On 28/12/13 10:01, Sebastian Moeller wrote:
Hi Rich,
great! A few comments:
Basic Settings:
[Is 95% the right fudge factor?] I think that ideally, if we can precisely
measure the usable link rate, even 99% of that should work out well to keep
the queue in our device. I assume that, due to the difficulties in measuring and
accounting for link properties such as link layer and overhead, people typically
rely on setting the shaped rate a bit lower than required to
stochastically/empirically account for the link properties. I predict that if
we get a correct description of the link properties to the shaper we should be
fine with 95% shaping. Note though, it is not trivial on an ADSL link to get
the actually usable bit rate from the modem, so 95% of what can be deduced from
the modem or the ISP's invoice might be a decent proxy…
[Do we have a recommendation for an easy way to tell if it's working? Perhaps a
link to a new Quick Test for Bufferbloat page. ] The linked page looks like a
decent probe for buffer bloat.
Basic Settings - the details...
CeroWrt is designed to manage the queues of packets waiting to be sent across
the slowest (bottleneck) link, which is usually your connection to the Internet.
I think we can only actually control the first link to the ISP, which
often happens to be the bottleneck. At a typical DSLAM (xDSL head end station)
the cumulative bandwidth sold to the customers is larger than the backbone
connection (this is called over-subscription and is almost guaranteed to be
the case in every DSLAM), which typically is not a problem, as people typically
do not use their internet connection that much. My point being we can not really control
congestion in the DSLAM's uplink (as we have no idea what the reserved rate per
customer is in the worst case, if there is any).
CeroWrt can automatically adapt to network conditions to improve the
delay/latency of data without any settings.
Does this describe the default fq_codels on each interface (except
ifb?)?
However, it can do a better job if it knows more about the actual link speeds
available. You can adjust this setting by entering link speeds that are a few
percent below the actual speeds.
Note: it can be difficult to get an accurate measurement of the link speeds.
The speed advertised by your provider is a starting point, but your experience
often won't meet their published specs. You can also use a speed test program
or web site like
http://speedtest.net
to estimate actual operating speeds.
While this approach is commonly recommended on the internet, I do not
believe that it is that useful. Between a user and the speedtest site there are a
number of potential congestion points that can affect (reduce) the throughput,
like bad peering. That said, the speedtests will report something <= the
actual link speed and hence be conservative (interactivity stays great at 90% of
link rate as well as at 80%, so underestimating the bandwidth within reason does not
affect the latency gains from traffic shaping, it just sacrifices a bit more
bandwidth; and given the difficulty of actually measuring the attainable
bandwidth, this might have been effectively a decent recommendation even though the
theory of it seems flawed).
Be sure to make your measurement when network is quiet, and others in your home
aren’t generating traffic.
This is great advice.
I would love to comment further, but after reloading
http://www.bufferbloat.net/projects/cerowrt/wiki/Setting_up_AQM_for_CeroWrt_310
just returns a blank page and I can not get back to the page as of yesterday
evening… I will have a look later to see whether the page resurfaces…
Best
Sebastian
On Dec 27, 2013, at 23:09 , Rich Brown
<[email protected]>
wrote:
You are a very good writer and I am on a tablet.
Thanks!
I'll take a pass at the wiki tomorrow.
The shaper does up and down was my first thought...
Everyone else… Don’t let Dave hog all the fun! Read the tech note and give
feedback!
Rich
On Dec 27, 2013 10:48 AM, "Rich Brown" <[email protected]>
wrote:
I updated the page to reflect the 3.10.24-8 build, and its new GUI pages.
http://www.bufferbloat.net/projects/cerowrt/wiki/Setting_up_AQM_for_CeroWrt_310
There are still lots of open questions. Comments, please.
Rich
_______________________________________________
Cerowrt-devel mailing list
[email protected]
https://lists.bufferbloat.net/listinfo/cerowrt-devel