Re: Lossy cogent p2p experiences?

2023-09-07 Thread Masataka Ohta

Saku Ytti wrote:


And you will be wrong. Packet arriving out of order, will be
considered previous packet lost by host, and host will signal need for
resend.


As I already quoted from the very old and fundamental paper on
the E2E argument:

End-To-End Arguments in System Design

https://groups.csail.mit.edu/ana/Publications/PubPDFs/End-to-End%20Arguments%20in%20System%20Design.pdf
: 3.4 Guaranteeing FIFO Message Delivery

and as is described in rfc2001,

   Since TCP does not know whether a duplicate ACK is caused by a lost
   segment or just a reordering of segments, it waits for a small number
   of duplicate ACKs to be received.  It is assumed that if there is
   just a reordering of the segments, there will be only one or two
   duplicate ACKs before the reordered segment is processed, which will
   then generate a new ACK.  If three or more duplicate ACKs are
   received in a row, it is a strong indication that a segment has been
   lost.

In networking, it is well known that "Guaranteeing FIFO Message
Delivery" by the network is impossible, because packets arriving
out of order without packet loss is inevitable and not uncommon.

As such, slight reordering is *NOT* interpreted as previous
packet loss.

The allowed amount of reordering depends on TCP implementations
and can be controlled by upgrading TCP.
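
The quoted rule is easy to state as code. A minimal sketch,
assuming a hypothetical per-connection sender state of
(last_ack, dup_count) and the threshold of three from rfc2001:

    DUPACK_THRESHOLD = 3  # rfc2001: three duplicate ACKs imply loss

    def on_ack(state, ack_no):
        # Count consecutive duplicates of the last cumulative ACK.
        if ack_no == state['last_ack']:
            state['dup_count'] += 1
            if state['dup_count'] >= DUPACK_THRESHOLD:
                return 'fast-retransmit'  # strong indication of loss
            return 'wait'                 # tolerate slight reordering
        state['last_ack'] = ack_no        # new data ACKed; reset counter
        state['dup_count'] = 0
        return 'new-ack'

Raising the threshold is one way an implementation can tolerate
more reordering, which is the kind of control by upgrading TCP
mentioned above.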

    Masataka Ohta



Re: Lossy cogent p2p experiences?

2023-09-07 Thread Masataka Ohta

Tom Beecher wrote:


Well, not exactly the same thing. (But it's my mistake, I was referring to
L3 balancing, not L2 interface stuff.)


That should be the correct thing to refer to.


load-balance per-packet will cause massive reordering,


If the buffering delays of the ECMP paths can not be controlled, yes.


because it's random
spray , caring about nothing except equal loading of the members.


Equal loading on point-to-point links between two routers by
(weighted) round robin means mostly the same buffering delay,
which won't cause massive reordering.

Masataka Ohta



Re: Lossy cogent p2p experiences?

2023-09-06 Thread Masataka Ohta

Benny Lyne Amorsen wrote:


TCP looks quite different in 2023 than it did in 1998. It should handle
packet reordering quite gracefully;


Maybe, and even if it isn't, TCP may be modified. But that
is not my primary point.

ECMP, in general, means paths consisting of multiple routers
and links. The links have various bandwidths, and other
traffic may be merged at multi-access links or at routers.

Then, it is hopeless for the load balancing points to
control the buffers of the routers in the paths and the delays
caused by those buffers, which makes per-packet load balancing
hopeless.

However, as I wrote to Mark Tinka;

: If you have multiple parallel links over which many slow
: TCP connections are running, which should be your assumption,

with "multiple parallel links", which are single hop
pathes, it is possible for the load balancing point
to control amount of buffer occupancy of the links
and delays caused by the buffers almost same, which
should eliminate packet reordering within a flow,
especially when " many slow TCP connections are
running".

And, simple round robin should be good enough
for most of the cases (no lab testing at all, yet).
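
A minimal sketch of such per-packet round robin, assuming
hypothetical link objects with a send() method and equal speed:

    import itertools

    class RoundRobinBundle:
        def __init__(self, links):
            # Strict rotation over the parallel single-hop links;
            # no per-flow hashing is involved.
            self._next = itertools.cycle(links)

        def send(self, packet):
            next(self._next).send(packet)

With equal-speed links and almost equal buffer occupancy, a
later packet of a flow should not overtake an earlier one, which
is the point above.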

A little more aggressive approach is to fully
share a single buffer among all the parallel links.
But as that is not compatible with router architecture
today, I did not propose the approach.

    Masataka Ohta




Re: Lossy cogent p2p experiences?

2023-09-06 Thread Masataka Ohta

William Herrin wrote:


I recognize what happens in the real world, not in the lab or text books.


What's the difference between theory and practice?


W.r.t. the fact that there are so many wrong theories
and wrong practices, there is no difference.


In theory, there is no difference.


Especially because the real world includes labs and textbooks
and, as such, all the theories, including all the wrong ones,
exist in the real world.

Masataka Ohta



Re: Lossy cogent p2p experiences?

2023-09-06 Thread Masataka Ohta

Saku Ytti wrote:


Fun fact about the real world, devices do not internally guarantee
order. That is, even if you have identical latency links, 0
congestion, order is not guaranteed between packet1 coming from
interfaceI1 and packet2 coming from interfaceI2, which packet first
goes to interfaceE1 is unspecified.


So, you lack fundamental knowledge of the E2E argument, which is
fully applicable to situations in the real-world Internet.

In the very basic paper on the E2E argument published in 1984:

End-To-End Arguments in System Design

https://groups.csail.mit.edu/ana/Publications/PubPDFs/End-to-End%20Arguments%20in%20System%20Design.pdf

reordering is recognized in both the real and the theoretical
world:

3.4 Guaranteeing FIFO Message Delivery
Ensuring that messages arrive at the receiver in the same
order in which they are sent is another function usually
assigned to the communication subsystem.

which means, according to the paper, this "function", if
assigned to the network, can not be complete or correct, and,
unlike you, I'm fully aware of it.

> This is because packets inside lookup engine can be sprayed to
> multiple lookup engines, and order is lost even for packets coming
> from interface1 exclusively, however after the lookup the order is
> restored for _flow_, it is not restored between flows, so packets
> coming from interface1 with random ports won't be same order going out
> from interface2.

That is a broken argument for how identification of flows by
intelligent intermediate entities could work; it goes against
the E2E argument and against the reality that initiated this
thread.

In the real world, according to the E2E argument, attempts to
identify flows by intelligent intermediate entities are just
harmful from the beginning, which is why flow-driven
architecture, including that of MPLS, is broken and hopeless.

I really hope you understand the meaning of "intelligent intermediate
entities" in the context of the E2E argument.

    Masataka Ohta



Re: Lossy cogent p2p experiences?

2023-09-06 Thread Masataka Ohta

Mark Tinka wrote:


Are you saying you thought a 100G Ethernet link actually consisting
of 4 parallel 25G links, which is an example of "equal speed multi
parallel point to point links", was relying on hashing?


No...


So, though you wrote:

>> If you have multiple parallel links over which many slow
>> TCP connections are running, which should be your assumption,
>> the proper thing to do is to use the links with round robin
>> fashion without hashing. Without buffer bloat, packet
>> reordering probability within each TCP connection is
>> negligible.
>
> So you mean, what... per-packet load balancing, in lieu of per-flow
> load balancing?

you now recognize that per-flow load balancing is not a very
good idea.

Good.


you are saying that.


See above to find my statement of "without hashing".

Masataka Ohta



Re: Lossy cogent p2p experiences?

2023-09-05 Thread Masataka Ohta

Nick Hilliard wrote:


Are you saying you thought a 100G Ethernet link actually consisting
of 4 parallel 25G links, which is an example of "equal speed multi
parallel point to point links", was relying on hashing?


this is an excellent example of what we're not talking about in this 
thread.


Not "we", but "you".

A 100G serdes is an unbuffered mechanism which includes a PLL, and this 
allows the style of clock/signal synchronisation required for the 
deserialised 4x25G lanes to be reserialised at the far end.  This is one 
of the mechanisms used for packet / cell / bit spray, and it works 
really well.


That's why I mentioned round robin, instead of a fully shared
buffer, as the proper solution for this case.

This thread is talking about buffered transmission links on routers / 
switches on systems which provide no clocking synchronisation and not 
even a guarantee that the bearer circuits have comparable latencies. 
ECMP / hash based load balancing is a crock, no doubt about it;


See the first three lines of this mail to find that I explicitly
mentioned "equal speed multi parallel point to point links" as the
context for round robin.

As I already told you:

: In theory, you can always fabricate unrealistic counter examples
: against theories by ignoring essential assumptions of the theories.

you keep ignoring essential assumptions for no good purpose.

    Masataka Ohta



Re: Lossy cogent p2p experiences?

2023-09-04 Thread Masataka Ohta

William Herrin wrote:


Well it doesn't show up in long slow pipes because the low
transmission speed spaces out the packets,


Wrong. That is a phenomenon with slow access and a fast backbone,
which has nothing to do with this thread.

If the backbone is as slow as the access, no "space out" is
possible.


and it doesn't show up in
short fat pipes because there's not enough delay to cause the
burstiness.


A short pipe means the burst shows up at full speed,
continuously and without interruption.

> So I don't know how you figure it has nothing to do with
> long fat pipes,

That's your problem.

    Masataka Ohta


Re: Lossy cogent p2p experiences?

2023-09-04 Thread Masataka Ohta

William Herrin wrote:


No, not at all. First, though you explain slow start,
it has nothing to do with long fat pipes. The long fat
pipe problem is addressed by window scaling (and SACK).


So, I've actually studied this in real-world conditions and TCP
behaves exactly as I described in my previous email for exactly the
reasons I explained.


Yes, of course, which is my point. Your problem is that your
point about slow start has nothing to do with long fat pipes.

> Window scaling and SACK makes it possible for TCP to grow to consume
> the entire whole end-to-end pipe when the pipe is at least as large as
> the originating interface and -empty- of other traffic.

Totally wrong.

Unless the pipe is long and fat, a plain TCP without window scaling
or SACK will grow to consume the entire end-to-end pipe when
the pipe is at least as large as the originating interface and
-empty- of other traffic.

> Those
> conditions are rarely found in the real world.

It is usual that TCP consumes all the available bandwidth.

Exceptions, not so rare in the real world, are plain TCPs over
long fat pipes.
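
The arithmetic behind the exception: a plain TCP without window
scaling can keep at most 65535 bytes in flight per round trip,
so its throughput ceiling is window/RTT (example figures only):

    WINDOW = 65535 * 8                # bits in flight, plain TCP
    for rtt in (0.001, 0.01, 0.1):    # 1ms, 10ms, 100ms
        print(f"RTT {rtt * 1e3:5.0f} ms ->"
              f" {WINDOW / rtt / 1e6:6.1f} Mbps")
    # -> about 524, 52 and 5 Mbps, respectively

Only the long (high-RTT) fat pipe is left unfilled, which is why
the exception is limited to plain TCPs over long fat pipes.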

    Masataka Ohta




Re: Lossy cogent p2p experiences?

2023-09-04 Thread Masataka Ohta

Mark Tinka wrote:


ECMP, surely, is too abstract a concept to properly manage/operate
simple situations with equal speed multi parallel point to point links.


I must have been doing something wrong for the last 25 years.


Are you saying you thought a 100G Ethernet link actually consisting
of 4 parallel 25G links, which is an example of "equal speed multi
parallel point to point links", was relying on hashing?

    Masataka Ohta



Re: Lossy cogent p2p experiences?

2023-09-04 Thread Masataka Ohta

William Herrin wrote:


Hi David,

That sounds like normal TCP behavior over a long fat pipe.


No, not at all. First, though you explain slow start,
it has nothing to do with long fat pipes. The long fat
pipe problem is addressed by window scaling (and SACK).

As David Hubbard wrote:

: I've got a non-rate-limited 10gig circuit

and

: The initial and recurring packet loss occurs on any flow of
: more than ~140 Mbit.

the problem is caused not by the wire-speed limitation of a "fat"
pipe but by artificial policing at ~140 Mbps.
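
A minimal sketch of such a policer, as a single-rate token
bucket with assumed parameters (not Cogent's actual
configuration):

    class Policer:
        def __init__(self, rate_bps, burst_bytes):
            self.rate = rate_bps / 8.0   # refill, bytes per second
            self.depth = burst_bytes     # bucket depth
            self.tokens = burst_bytes
            self.last = 0.0

        def allow(self, size, now):
            # Refill tokens for the elapsed time, up to the depth.
            self.tokens = min(self.depth,
                              self.tokens + (now - self.last) * self.rate)
            self.last = now
            if size <= self.tokens:
                self.tokens -= size
                return True              # conforming: forward
            return False                 # exceeding: drop

A TCP flow ramping past the configured rate drains the bucket
and then sees exactly the initial and recurring loss David
described.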

    Masataka Ohta



Re: Lossy cogent p2p experiences?

2023-09-04 Thread Masataka Ohta

Nick Hilliard wrote:


In this case, "Without buffer bloat" is an essential assumption.


I can see how this conclusion could potentially be reached in
specific styles of lab configs,


I'm not interested in how poorly you configure your
lab.


but the real world is more complicated and


And, this thread was initiated because of unreasonable
behavior apparently caused by stupid attempts at
automatic flow detection followed by policing.

That is the real world.

Moreover, it has been well known, both in theory and in
practice, that flow-driven architecture relying on
automatic detection of flows does not scale and is
no good, though MPLS relies on that broken flow-driven
architecture.

> Generally in real world situations on the internet, packet reordering
> will happen if you use round robin, and this will impact performance
> for higher speed flows.

That is my point already stated by me. You don't have to repeat
it again.

> It's true that per-hash load
> balancing is a nuisance, but it works better in practice on larger
> heterogeneous networks than RR.

Here, you implicitly assume a large number of slower flows,
against your own statement about "higher speed flows".

    Masataka Ohta



Re: Lossy cogent p2p experiences?

2023-09-03 Thread Masataka Ohta

Nick Hilliard wrote:


the proper thing to do is to use the links with round robin
fashion without hashing. Without buffer bloat, packet
reordering probability within each TCP connection is
negligible.


Can you provide some real world data to back this position up?


See, for example, the famous paper "Sizing Router Buffers".

With the thousands of TCP connections at the backbone recognized
by the paper, buffers holding thousands of packets won't cause
packet reordering.
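
The paper's rule of thumb is that, with N desynchronized
long-lived flows, a buffer of RTT*C/sqrt(N) suffices instead of
the classic RTT*C. With the paper's example figures (a sketch,
not a lab measurement):

    from math import sqrt

    rtt = 0.25        # seconds
    capacity = 10e9   # 10 Gb/s link
    flows = 50000     # backbone flow count in the paper's example

    classic = rtt * capacity        # ~2.5e9 bits, ~312 MB
    small = classic / sqrt(flows)   # ~1.1e7 bits, ~1.4 MB
    print(classic / 8e6, small / 8e6)  # in MB

A buffer of the order of a megabyte holds only thousands of
full-sized packets, which is the figure used above.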

What you said reminds me of the old saying: in theory, there's no 
difference between theory and practice, but in practice there is.


In theory, you can always fabricate unrealistic counter examples
against theories by ignoring essential assumptions of the theories.

In this case, "Without buffer bloat" is an essential assumption.

    Masataka Ohta



Re: Lossy cogent p2p experiences?

2023-09-03 Thread Masataka Ohta

Mark Tinka wrote:

So you mean, what... per-packet load balancing, in lieu of per-flow load 
balancing?


Why, do you think, can you rely on the existence of flows?


So, if you internally have 10 parallel 1G circuits expecting
perfect hashing over them, it is not "non-rate-limited 10gig".


It is understood in the operator space that "rate limiting" generally 
refers to policing at the edge/access.


And nothing beyond, of course.

The core is always abstracted, and that is just capacity planning and 
management by the operator.


ECMP, surely, is too abstract a concept to properly manage/operate
simple situations with equal speed multi parallel point to point links.

    Masataka Ohta



Re: Lossy cogent p2p experiences?

2023-09-03 Thread Masataka Ohta

Mark Tinka wrote:


Wrong. It can be performed only at the edges by policing total
incoming traffic without detecting flows.


I am not talking about policing in the core, I am talking about 
detection in the core.


I'm not talking about detection at all.

Policing at the edge is pretty standard. You can police a 50Gbps EoMPLS 
flow coming in from a customer port in the edge. If you've got N x 
10Gbps links in the core and the core is unable to detect that flow in 
depth to hash it across all those 10Gbps links, you can end up putting 
all or a good chunk of that 50Gbps of EoMPLS traffic into a single 
10Gbps link in the core, despite all other 10Gbps links having ample 
capacity available.


Relying on hashing is a poor way to offer wide bandwidth.

If you have multiple parallel links over which many slow
TCP connections are running, which should be your assumption,
the proper thing to do is to use the links with round robin
fashion without hashing. Without buffer bloat, packet
reordering probability within each TCP connection is
negligible.

A faster TCP may suffer from packet reordering during slight
congestion, but the effect is like that of RED.

Anyway, in this case, the situation is:

:Moreover, as David Hubbard wrote:
:> I've got a non-rate-limited 10gig circuit

So, if you internally have 10 parallel 1G circuits expecting
perfect hashing over them, it is not "non-rate-limited 10gig".

    Masataka Ohta



Re: Lossy cogent p2p experiences?

2023-09-02 Thread Masataka Ohta

Mark Tinka wrote:

it is the 
core's ability to balance the Layer 2 payload across multiple links 
effectively.


Wrong. It can be performed only at the edges by policing total
incoming traffic without detecting flows.


While some vendors have implemented adaptive load balancing algorithms


There are no such algorithms because, as I wrote:

: 100 50Mbps flows are as harmful as 1 5Gbps flow.

Masataka Ohta


Re: Lossy cogent p2p experiences?

2023-09-02 Thread Masataka Ohta

Mark Tinka wrote:


On 9/1/23 15:59, Mike Hammett wrote:


I wouldn't call 50 megabit/s an elephant flow


Fair point.


Both of you are totally wrong, because the proper thing to do
here is to police, if *ANY* policing is done, based on total
traffic without detecting any flow.

100 50Mbps flows are as harmful as 1 5Gbps flow.

Moreover, as David Hubbard wrote:

> I’ve got a non-rate-limited 10gig circuit

there is no point of policing.

Detection of elephant flows was wrongly considered useful
in flow-driven architecture to automatically bypass L3
processing for the flows, when L3 processing capability
was wrongly considered limited.

Then, the topology-driven architecture of MPLS appeared, even
though topology driven is flow driven (you can't put inner
MPLS labels on packets without knowing detailed routing
information at the destinations, which is hidden from the
source through route aggregation, except on demand after
detecting flows).

    Masataka Ohta



Re: NTP Sync Issue Across Tata (Europe)

2023-08-14 Thread Masataka Ohta

Forrest Christian (List Account) wrote:


There are lots of ways to improve a GPS-based NTP server.  Better antenna
positioning.  Better GPS chipset.  Paying attention to antenna patterns.
  Adding notch filters to the GPS feed.  And so on.


They are not very meaningful improvements.


But, in the end, there is nothing better than adding a second
GPS source at a diverse location as far as improving reliability, provided
that's an option based on timing needs.


You keep ignoring DOS attacks.

Though you wrote:

: If I just want to deny you time, it gets cheaper and
: easier.   All I need is a 1.2 GHz oscillator coupled to an
: antenna. There are units like this available for under $10,
: delivered.  These block GPS trackers on trucks and/or private
: automobiles.   Build your own and you can get a watt or two
: to shove into a tiny antenna for not a lot more. Guaranteed
: to Jam anything within a couple of blocks.

you don't understand that DOS attacks are similarly effective.


I can also attest that there is at least one overlap between time-nuts and
NANOG


See above.

Masataka Ohta



Re: NTP Sync Issue Across Tata (Europe)

2023-08-14 Thread Masataka Ohta

Mike Hammett wrote:


" As such, the ultimate (a little expensive) solution is to have
your own Rb clocks locally."



Yeah, that's a reasonable course of action for most networks.


For most data centers with time sensitive transactions, at least.


*sigh*


https://en.wikipedia.org/wiki/Atomic_clock
Modern rubidium standard tubes last more than ten years,
and can cost as little as US$50.

https://www.ebay.com/sch/i.html?_nkw=rubidium

    Masataka Ohta



Re: NTP Sync Issue Across Tata (Europe)

2023-08-13 Thread Masataka Ohta

John Gilmore wrote:


Subsequent conversation has shown that you are both right here.

Yes, many public NTP servers ARE using GPS-derived time.
Yes, some public NTP servers ARE NOT using GPS-derived time.


The point is whether

: 2) Run a set of internal NTPd servers, and configure them to pull
: time from all of your GPS-derived NTP servers, AND trusted public
: NTP servers

is a proper recommendation against total GPS failure or not.


At one point I proposed that some big NTP server pools be segregated by
names, to distinguish between GPS-derived time and national-standard
derived time.  For example, two domain names could be e.g.:

   fromnist.pool.tick.tock
   fromgps.pool.tick.tock


A problem is that a public NTP server, which is not necessarily
stratum 1, may depend on both.

Another problem is that domain name management is not so
trustworthy. An NTP server once relying on NIST may now
rely on GPS, but the administrator of the server may not
change its domain name.

"trusted public NTP servers" is not a trustworthy or
verifiable concept.


PS: When we say "GPS", do we really mean any GNSS (global navigation
satellite system)?  There are now four such systems that have global
coverage, plus regionals.  While they attempt to coordinate their
time-bases and reference-frames, they are using different hardware and
systems, and are under different administration, so there are some
differences in the clock values returned by each GNSS.  These
differences and discontinuties have ranged up to 100ns in normal
operation, and higher amounts in the past.  See:


Because of relativity, 100ns of time difference between
locations more than 30m apart can not be a problem for correct
transaction processing or ordering of events.
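
The arithmetic behind the 30m figure: within 100ns, no signal
can travel farther than

    d = c * dt = (3 * 10^8 m/s) * (100 * 10^-9 s) = 30 m

so events at locations more than 30m apart can not be ordered
more finely than that by any physical means.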

    Masataka Ohta



Re: NTP Sync Issue Across Tata (Europe)

2023-08-11 Thread Masataka Ohta

Forrest Christian (List Account) wrote:


The NIST time servers do NOT get their time from GPS.


No, of course. I know it very well.

However, as I wrote:

> But, additionally relying on remote servers (including those
> provided by NIST) is subject to DOS attacks.

the (mostly wired) Internet is just as secure/insecure as
wireless GPS, so NIST servers can not be reliably accessed
over it.

Just as many people who only know the wired Internet blindly
think wireless channels are secure, you can not recognize the
various attack modes against the mostly wired Internet.


These are physical realizations of UTC...  that is,  a phase-aligned 1PPS
pulse and a high precision clock signal.   These realizations are used to
directly drive the NIST NTP servers at each location.   GPS is not
involved.


UTC??? You are totally wrong.

Just like many other people, you are purposelessly seeking
meaningless accuracy assuming the inertial frame of UTC,
which is *NOT* required for correct transactions.

Because of relativity, we can assume *ANY* inertial frame
for simultaneity, which means the simultaneity requirement is
not so strong.

Moreover, the information cone allows even weaker simultaneity
for correct transactions.


These two timescales are within a few ns
of each other, also verified with GNSS common view technology, so one can
consider them the same for most purposes.


You don't understand the simultaneity of the theory of
relativity at all.

10ns of time difference can not be physically or logically
meaningful between locations 3m apart.


Note that a similar process is used to derive UTC(NICT) in Japan.


Depending on the inertial frame, time in the US and in Japan can
differ by a lot more than 1ms, which means the timing error
between mainland US and Japan can be a lot more than 1ms.


As far as a rubidium clock goes, I'd much rather see it disciplined
regularly to a GPS time source, but that comes from the fact that I like my
1PPS to be within a microsecond or so of UTC due to the precision I need in
the lab.


As I already wrote:

: For millisecond accuracy, Rb clocks do not need any synchronization
: for centuries.
: Rb clocks on GPS are a lot more frequently synchronized, because
: a lot more accuracy is required for positioning (10ns of timing
: error means 3m of positioning error).

you didn't understand the accuracy required by Internet
operators, which is your problem.


Note that some of the high end appliances I'm referring to just use GPS
over days and weeks to discipline a precision oscillator (sometimes
rubidium) which is essentially an automatic calibrating version of what
you're proposing.


That has nothing to do with the a lot broader accuracy required
by the theory of special relativity for proper causality.

    Masataka Ohta



Re: NTP Sync Issue Across Tata (Europe)

2023-08-11 Thread Masataka Ohta

Forrest Christian (List Account) wrote:


The recommendation tends to be the following:

1) Run your GPS-derived NTP appliances, but DO NOT point end-user
clients at it. 2) Run a set of internal NTPd servers, and configure
them to pull time from all of your GPS-derived NTP servers, AND
trusted public NTP servers 3) Point your clients at the internal NTPd
servers.


That is not a very good recommendation. See below.


At some point, using publicly available NTP sources is redundant
unless one wants to mitigate away the risks behind failure of the GPS
system itself.


Your assumption that public NTP servers were not GPS-derived NTP
servers is just wrong.


What I'm advocating against is the seemingly common practice to go
buy an off-the-shelf lower-cost GPS-NTP appliance (under $1K or so),
stick an antenna in a window or maybe on the rooftop, and point all
your devices at that device.


Relying on a local expensive GPS appliance does not improve
security so much and is the worst thing to do.

But, additionally relying on remote servers (including those
provided by NIST) is subject to DOS attacks.

As such, the ultimate (a little expensive) solution is to have
your own Rb clocks locally.

Masataka Ohta



Re: NTP Sync Issue Across Tata (Europe)

2023-08-09 Thread Masataka Ohta

John Gilmore wrote:


 I was also speaking specifically about installing GPS antennas in
 viable places, not using a facility-provided GPS or NTP service.


Am I confused?  Getting the time over a multi-gigabit Internet from a
national time standard agency such as NIST (or your local country's
equivalent) should produce far better accuracy and stability than
relying on locally received GPS signals.


When the (wrong) question is "how to build a stratum 1 server?",
that can not be an answer.


GPS uses very weak radio
signals which are regularly spoofed by all sorts of bad actors:


The question, seemingly, is not "how to build a secure stratum
1 server?".

BTW, the proper question should be "how to obtain secure time?".

    Masataka Ohta



Re: NTP Sync Issue Across Tata (Europe)

2023-08-08 Thread Masataka Ohta

Forrest Christian (List Account) wrote:


Depends on how synchronized you need to be.


Sure. But, we should be assuming NTP is mostly enough.


A rubidium oscillator or Chip Scale Atomic Clock is in the price range you
quote.  However, these can drift enough that you should occasionally
synchronize with a reference time source.  This is to ensure continued
millisecond accuracy.  Of course it all depends on how much drift you'll
tolerate, and if you're OK with being within a second, then a rubidium
might be ok.


For millisecond accuracy, Rb clocks do not need any synchronization
for centuries.

Rb clocks on GPS are a lot more frequently synchronized, because
a lot more accuracy is required for positioning (10ns of timing
error means 3m of positioning error).
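
A sketch of the arithmetic behind both figures, assuming,
hypothetically, a constant fractional frequency offset (real Rb
aging should be checked against a datasheet):

    YEAR = 365.25 * 86400                  # seconds

    def time_error(f_offset, interval_s):
        # error accumulated under a constant fractional offset
        return f_offset * interval_s

    print(time_error(1e-12, 100 * YEAR))   # ~3.2e-3 s per century
    print(3e8 * 10e-9)                     # 10ns of error ~ 3m of range

An offset of the order of 1e-12 keeps millisecond-level accuracy
over a century, while positioning needs nanoseconds.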

Masataka Ohta



Re: NTP Sync Issue Across Tata (Europe)

2023-08-08 Thread Masataka Ohta

Mel Beckman wrote:

> To be useful, any atomic clocks you operate must be synchronized
> to a Stratum Zero time source, such as GPS.

Only initially.


Precise time is crucial to a variety of economic activities around
the world. Communication systems, electrical power grids, and
financial networks all rely on precision timing for synchronization
and operational efficiency. The free availability of GPS time has
enabled cost savings for companies that depend on precise time and
has led to significant advances in capability.


FYI, the time difference between two points is not noticeable,
that is, does not affect the correctness of any distributed
algorithm, if the difference is below the communication delay
between the points, which means rough synchronization by NTP is
good enough.

That is an information theoretic version of relativity of simultaneity:

https://en.wikipedia.org/wiki/Relativity_of_simultaneity

For information theoretic simultaneity, you can consider, instead
of light cone, information cone.

    Masataka Ohta


Re: NTP Sync Issue Across Tata (Europe)

2023-08-07 Thread Masataka Ohta

Forrest Christian (List Account) wrote:


In the middle tends to be a more moderate solution which involves a mix of
time transmission methods from a variety of geographically and/or network
diverse sources.  Taking time from the public trusted ntp servers and
adding lower cost GPS receivers at diverse points in your network seems
like a good compromise in the middle.  That way,  only coordinated attacks
will be successful.


Instead, just rely on atomic clocks operated by you. They are not
so expensive (several thousand dollars) and should be accurate
enough without adjustment for hundreds of years. There can be no
coordinated attacks. They may be remotely accessed through
secured NTP.

Masataka Ohta


Re: New addresses for b.root-servers.net

2023-06-21 Thread Masataka Ohta

Mark Andrews wrote:

>> If an end and another end directly share a secret
>> key without involving untrustworthy trusted third
>> parties, the ends are secure end to end.

>> An untrustworthy but lightweight and inexpensive (or free)
>> PKI may be worth its price and may be useful to make IP address
>> based security a little better.


Which you can do with DNSSEC but the key management will be enormous.


Which part of my message are you responding to? The first part?

Though you might have forgotten, my initial proposal of DNSSEC
actually allows to use both public and shared keys.

With hierarchical KDCs (Key Distribution Centers), instead
of hierarchical CAs, key management is not enormous.

A shared key is better than a public key, because revocation
is instantaneous. Instead, root KDCs receive a large amount
of requests. But the situation is similar to that of the DNS
root servers today and is manageable.

Kerberos relies on KDCs.

However, the shared keys are shared by ends and intermediate
systems of KDCs, which is not end to end security.

    Masataka Ohta



Re: New addresses for b.root-servers.net

2023-06-20 Thread Masataka Ohta

Matt Corallo wrote:


As PKI, including DNSSEC, is subject to MitM attacks, is
not cryptographically secure, does not provide end to end
security and is not actually workable, why do you bother?


It sounds like you think nothing is workable, we simply cannot make 
anything secure


If an end and another end directly share a secret
key without involving untrustworthy trusted third
parties, the ends are secure end to end.
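
A minimal sketch of that model in Python, with a key assumed to
have been exchanged out of band:

    import hmac, hashlib, os

    key = os.urandom(32)  # directly shared by the two ends (assumed)

    def seal(message: bytes) -> bytes:
        # Authenticate with the shared key; no CA or KDC is consulted.
        return hmac.new(key, message, hashlib.sha256).digest() + message

    def open_sealed(blob: bytes) -> bytes:
        tag, message = blob[:32], blob[32:]
        want = hmac.new(key, message, hashlib.sha256).digest()
        if not hmac.compare_digest(tag, want):
            raise ValueError("forged or corrupted message")
        return message

No third party can forge a valid tag without the key, which is
the sense in which the ends are secure end to end.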

- if we should give up on WebPKI (and all its faults) 
and DNSSEC (and all its faults) and RPKI (and all its faults), what do 
we have left?


An untrustworthy but lightweight and inexpensive (or free)
PKI may be worth its price and may be useful to make IP address
based security a little better.

Masataka Ohta



Re: New addresses for b.root-servers.net

2023-06-20 Thread Masataka Ohta

Matt Corallo wrote:


So, let's recognize ISPs as trusted authorities and
we are reasonably safe without excessive cost to
support DNSSEC with all the untrustworthy hypes of
HSMs and four-eyes principle.


I think this list probably has a few things to say about "ISPs as 
trusted authorities" 


I'm afraid you miss the point.

My point is that trusted third parties of CAs including
DNSSEC providers are at least as untrustworthy as ISPs.

- is everyone on this list already announcing and 
enforcing an exact ASPA policy (or BGPSec or so) and ensuring the full 
path for each packet they send is secure and robust to ensure it gets to 
its proper destination?


I'm afraid that is a hype as bad as HSMs and four-eyes
principle.


Somehow I don't think this model is workable,


As PKI, including DNSSEC, is subject to MitM attacks, is
not cryptographically secure, does not provide end to end
security and is not actually workable, why do you bother?

    Masataka Ohta



Re: New addresses for b.root-servers.net

2023-06-19 Thread Masataka Ohta

Matt Corallo wrote:


Note that diginotar was advertised to be operated
with HSMs and four-eyes principle, which means
both of them were proven to be untrustworthy
marketing hypes.


Even more reason to do DNSSEC stapling!


See hypes of HSMs and four-eyes from DNSSEC
operators.

This is totally unrelated to the question at hand. There wasn't a 
question about whether a user relying on trusted authorities can maybe 
be whacked by said trusted authorities (though there's been a ton of 
work in this space, most notably requiring CT these days),


So, let's recognize ISPs as trusted authorities and
we are reasonably safe without excessive cost to
support DNSSEC with all the untrustworthy hypes of
HSMs and four-eyes principle.

it was purely 
about whether we can rely on pure "I sent a packet to IP X, did it get 
to IP X", which *is* solved by DNSSEC.


That's overkill. See above for the proper solution.

    Masataka Ohta



Re: New addresses for b.root-servers.net

2023-06-19 Thread Masataka Ohta

Matt Corallo wrote:


Both in theory and practice, DNSSEC is not secure end to
end


Indeed, but (a) there's active work in the IETF to change that (DNSSEC 
stapling to TLS certs)


TLS? What? As was demonstrated by diginotar, PKI is NOT
cryptographically secure and vulnerable to MitM attacks
on intermediate intelligent entities of CAs.

Note that diginotar was advertised to be operated
with HSMs and four-eyes principle, which means
both of them were proven to be untrustworthy
marketing hypes.

and (b) that wasn't the point - the above post 
said "It’s not like you can really trust your packets going to B _today_ 
are going to and from the real B (or Bs)." which is exactly what DNSSEC 
protects against!


As long as root key rollover is performed in time and
intermediate zones such as ccTLDs are not compromised,
maybe, which is why it is not very useful or secure.

The following description

https://en.wikipedia.org/wiki/DigiNotar
Secondly, they issued certificates for the Dutch
government's PKIoverheid ("PKIgovernment") program.
This issuance was via two intermediate certificates,
each of which chained up to one of the two "Staat der
Nederlanden" root CAs. National and local Dutch
authorities and organisations offering services for the
government who want to use certificates for secure internet
communication can request such a certificate. Some of the
most-used electronic services offered by Dutch governments
used certificates from DigiNotar. Examples were the
authentication infrastructure DigiD and the central
car-registration organisation Netherlands Vehicle
Authority [nl] (RDW).

makes it clear that entities operating ccTLDs may also
be compromised.

If its not useful, please describe a mechanism by which an average 
recursive resolver can be protected against someone hijacking C root on 
Hurricane Electric (which doesn't otherwise have the announcement at 
all, last I heard) and responding with bogus data?


As DNSSEC capable resolvers are not very secure, you don't
have to make plain resolvers so secure.


For example, root key rollover is as easy/difficult as
updating IP addresses for b.root-servers.net.


Then maybe read the rest of this thread, cause lots of folks pointed out 
issues with *just* updating the IP and not bothering to give it some 
time to settle :)


In this thread, I'm the first to have pointed out that old IP
addresses of root servers must be reserved (for 50 years).


    Masataka Ohta


Re: New addresses for b.root-servers.net

2023-06-18 Thread Masataka Ohta

Matt Corallo wrote:


That's great in theory, and folks should be using DNSSEC [1],


Wrong.

Both in theory and practice, DNSSEC is not secure end to
end and is not very useful.

For example, root key rollover is as easy/difficult as
updating IP addresses for b.root-servers.net.

Masataka Ohta



Re: New addresses for b.root-servers.net

2023-06-12 Thread Masataka Ohta

Mark Andrews wrote:


The commitment to maintain service for 1 year after the new LACNIC
addresses are switched in to the root.hints from IANA does not mean that
this is a cutoff date and that we intend to turn off service on the
older addresses after a year.  We currently have no plans to do so for
the foreseeable future. In fact, the possibility has not even been
suggested or discussed at all.


Such total lack of advance and public discussion and preparation
on a substantial change on critical infrastructure is a serious
problem, I'm afraid.



I'm curious about what more discussion you want to happen than has
happen in the past. Over the last 20 years there have been lots of
address changes.


If such changes are performed without proper transition plans
even after DNS became critical infrastructure (when?), they
also are serious problems.


None of them have caused operational problems.


Thank you for a devil's proof. That you haven't noticed any
problem does not mean there actually was no problem.

Masataka Ohta




Re: New addresses for b.root-servers.net

2023-06-08 Thread Masataka Ohta

Robert Story wrote:


The commitment to maintain service for 1 year after the new LACNIC
addresses are switched in to the root.hints from IANA does not mean that
this is a cutoff date and that we intend to turn off service on the
older addresses after a year.  We currently have no plans to do so for
the foreseeable future. In fact, the possibility has not even been
suggested or discussed at all.


Such total lack of advance and public discussion and preparation
on a substantial change on critical infrastructure is a serious
problem, I'm afraid.

Masataka Ohta



Re: New addresses for b.root-servers.net

2023-06-08 Thread Masataka Ohta

Mark Andrews wrote:


It announces itself to an address which remains under the control of
USC/ISI the current and on going root server operator for b.root-servers.net.
So apart from leaking that the root hints have not been updated I don’t
see a big risk here.  The address block, as has been stated, is in a reserved
range for critical infrastructure and, I suspect, has special controls placed
on it by ARIN regarding its re-use should USC/ISI ever release it / cease to
be a root-server operator.  I would hope that ARIN and all the RIRs have
the list of current and old root-server addresses and that any block that
are being transferred that have one of these addresses are flagged for
special consideration.


I'm afraid that "old root-server addresses" will not
be considered for "critical infrastructure" at least
by those people who can't see operational difficulties
to change the addresses.

    Masataka Ohta



Re: New addresses for b.root-servers.net

2023-06-01 Thread Masataka Ohta

William Herrin wrote:


Certainly we would appreciate other opinions about what the right length
of a change-over time would be, especially from the operational
communities that will be most impacted by this change.


Considering the possibility that, in the long run, the remaining
12 sets (IPv4 and IPv6) of IP addresses will also change, the proper
length should be determined assuming all 13 sets of
addresses will change (not necessarily at the same time).


A server generation is about 3 years before it's obsolete and is
generally replaced. I suggest making the old address operable for two
generations (6 years) and black-holed for another generation (3 more
years).


You are assuming managed servers under Moore's law.

But, after Moore, a server generation will be longer.

Moreover, a Linux-based black box, the vendor of which has
disappeared, may be used for 10 or 20 years without being
managed.

Then, another important period is the period to reserve
the IP addresses once used for root servers. If the
addresses are reused by some bad guys, systems
depending on them can easily be compromised.

For the reservation period, 50 years of reservation
period of ISO3166 country codes seems to be reasonable.

And, if the addresses are reserved, there is no
reason not to keep using the addresses as
alternative addresses of active root name servers.

Masataka Ohta

PS

First of all, it is a bad idea to change the
addresses of root servers. For political ceremony, it
is enough to transfer address blocks to LACNIC.



Re: Spectrum (legacy TWC) Infrastructure - Contact Off List

2023-02-06 Thread Masataka Ohta

Mike Hammett wrote:

In no way is what I said wrong. Incumbent operators (coax or copper
pairs) screw things up constantly (whether technically or in the
business side of things), prompting a sea of independent operators
to overbuild them (or fill in where they haven't).


See below:

: https://en.wikipedia.org/wiki/Incumbent_local_exchange_carrier
: Various regional independents also held incumbent monopolies
: in their respective regions.

to see that many independent operators are incumbent operators.


I don't mean non-RBOC ILECs. I mean WISPs, regional fiber operators,


I'm afraid "non-RBOC" is a synonym of "independent".

Anyway, ILECs including both RBOCs and thousands of non-RBOC ones
should be the regional fiber operators, as I already wrote:

: Many ILECs enjoying regional monopoly should be 100+ years old:

: https://en.wikipedia.org/wiki/Independent_telephone_company
: By 1903 while the Bell system had 1,278,000 subscribers on
: 1,514 main exchanges, the independents, excluding non-profit
: rural cooperatives, claimed about 2 million subscribers on
: 6,150 exchanges.[1]
: The size ranged from small mom and pop companies run by a
: husband and wife team, to large independent companies,

: many of which should now be PON operators still enjoying regional
: monopoly.

> Bob from down the street that retired and built a fiber company to
> serve his small town. I mean companies with less than 10,000
> customers and are younger than 20 years. There are literally
> thousands of them in the US and they're only getting more formidable
> in the face of lousy incumbents.

See above:

: The size ranged from small mom and pop companies run by a
: husband and wife team

Thousands of Bobs from down the street retired and built telephone
companies, now recognized as non-RBOC ILECs, to serve their small
towns 100+ years ago.

Newly coming Bobs can survive as regional fiber operators
only in regions not served by ILECs as PON providers.

    Masataka Ohta


Re: Smaller than a /24 for BGP?

2023-02-06 Thread Masataka Ohta

Michael Bolton via NANOG wrote:

> We would benefit from advertising /25's but it hurt's more
> than it helps.

That is, IPv6 really hurts.


I'm in the alarm industry and they still haven't started adopting
IPv6. If we allow /25 subnets, some industries will never change. In
a sense, we have to “force” them to change.


FYI, w.r.t. routing table bloat, IPv6, having a much longer
minimum allocation prefix than /24 (which forbids operators
from cutting IPv6 prefixes longer than /24), that is, far
beyond direct SRAM lookup, and, worse, needing a longer TCAM
word size (64 or 128 bits?) than IPv4, is, in the not so long
run, a lot worse than IPv4.

    Masataka Ohta


Re: Spectrum (legacy TWC) Infrastructure - Contact Off List

2023-02-06 Thread Masataka Ohta

Mike Hammett wrote:


Where did you think that condensation was going to get you in
this conversation?


I was involved in this thread because of your totally wrong
statement of:

: I selfishly hope they don't because that's where independent
: operators will succeed. ;-)

First of all, "Spectrum (legacy TWC)" is not a small company.

Moreover, as is stated in wikipedia that:

>https://en.wikipedia.org/wiki/Incumbent_local_exchange_carrier
>Various regional independents also held incumbent monopolies
>in their respective regions.

many independent operators have kept succeeding for 100+ years,
not because they unreasonably cut maintenance costs but because
they have achieved regional monopoly.

    Masataka Ohta



Re: Spectrum (legacy TWC) Infrastructure - Contact Off List

2023-02-05 Thread Masataka Ohta

Mike Hammett wrote:


Except there are literally thousands of independent ISPs in the US,
many 10+ years old that aren't likely to be going anywhere and
they are moving to constructing their own wireline.

Many ILECs enjoying regional monopoly should be 100+ years old:

https://en.wikipedia.org/wiki/Incumbent_local_exchange_carrier
Various regional independents also held incumbent monopolies
in their respective regions.

https://en.wikipedia.org/wiki/Independent_telephone_company
By 1903 while the Bell system had 1,278,000 subscribers on
1,514 main exchanges, the independents, excluding non-profit
rural cooperatives, claimed about 2 million subscribers on
6,150 exchanges.[1]
The size ranged from small mom and pop companies run by a
husband and wife team, to large independent companies,

many of which should now be PON operators still enjoying regional
monopoly.

So?

    Masataka Ohta


Re: Spectrum (legacy TWC) Infrastructure - Contact Off List

2023-02-05 Thread Masataka Ohta

Mike Hammett wrote:


Maybe it's not as hard as everyone says?


That's exactly the way of thinking of investors during a
bubble.

It should be noted that the corona virus not only caused a
depression, against which the QE policy was chosen, but also
forced people to stay at home.

As such, investing in internet access seemed promising,
and some money was also invested in high-speed inexpensive
satellite internet, even though satellite internet must
be low speed or expensive.

Masataka Ohta



Re: Spectrum (legacy TWC) Infrastructure - Contact Off List

2023-02-04 Thread Masataka Ohta

Mike Hammett wrote:


Yet the independents are doing it anyway.


A petit bubble caused by quantitative easing, perhaps.

Masataka Ohta






Re: Spectrum (legacy TWC) Infrastructure - Contact Off List

2023-02-02 Thread Masataka Ohta

Mike Hammett wrote:


I selfishly hope they don't because that's where independent
operators will succeed. ;-)


Because of natural regional monopoly at physical layer (cabling
cost for a certain region is same between competitors but their
revenues are proportional to their regional market shares), they
can't succeed unless the physical layer is regulated to be
unbundled, which is hard with PON.

But, in the US, where the regional telephone network has been
operated by, unlike in Europe/Japan, private companies enjoying
natural regional monopoly, the economic situation today should
be no worse than it was at that time.

Masataka Ohta


Re: Smaller than a /24 for BGP?

2023-01-29 Thread Masataka Ohta

I wrote:


So,
another way of multihoming critically depends on replacing the layer-4
protocols with something that doesn't intermingle the IP address with
the connection identifier.


Wrong. As is stated in my ID that:

    On the other hand, with end to end multihoming, multihoming is
    supported by transport (TCP) or application layer (UDP etc.) of end
    systems and does not introduce any problem in the network and works
    as long as there is some connectivity between the end systems.

end to end multihoming may be supported at the application layer
by trying all the available addresses, which is what DNS and
SMTP are actually doing.


To my surprise, I've found that the current (2017) happy
eyeballs already does so, as is stated in rfc8305:

: Appendix A.  Differences from RFC 6555
:o  how to handle multiple addresses from each address family

So, we are ready for end to end multihoming for which multiple
PA addresses are enough and /24 is not necessary. Though
not all the application protocols may support it, DNS, SMTP
and HTTP(S) should be good enough as a starter.

It should be noted that happy eyeballs strongly depends on
DNS, even though someone might think DNS is not guaranteed.

Your web server is multihomed if you assign it PA
addresses from multiple ISPs and register
the addresses in DNS. You don't have to manage BGP.
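
A minimal sketch of the client side, trying every address DNS
returns (sequentially here; rfc8305 actually races attempts with
staggered starts):

    import socket

    def connect_multihomed(host, port, timeout=2.0):
        err = None
        for family, type_, proto, _, addr in socket.getaddrinfo(
                host, port, type=socket.SOCK_STREAM):
            s = socket.socket(family, type_, proto)
            s.settimeout(timeout)
            try:
                s.connect(addr)
                return s              # first reachable address wins
            except OSError as e:
                err = e
                s.close()
        raise err or OSError("no address worked")

If one ISP of the multihomed server fails, connections simply
succeed over an address from the other ISP.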


TCP modification is just an option useful for long lasting
TCP connections.


A major obstacle to it, as most of you can see, is that there
are people who can't distinguish IP address changes for mobility
from those for multihoming. Such people will keep reinventing
MPTCP.

Masataka Ohta



Re: Smaller than a /24 for BGP?

2023-01-28 Thread Masataka Ohta

William Herrin wrote:


 The easiest way for applications know all the addresses of the
 destination is to use DNS. With DNS reverse, followed by forward,
 lookup, applications can get a list of all the addresses of the
 destination from an address of the destination.


The DNS provides no such guarantee.


Guarantee for what?

Remember that we have been enjoying secure confirmation that
a certain IP address belongs to a certain hostname by DNS reverse
lookup without any guarantee.

> Moreover, the DNS does guarantee
> its information to be correct until the TTL expires, making it
> unsuitable for communicating address information which may change
> sooner.

I'm afraid you know very little about DNS operation. See rfc1034:

   If a change can be anticipated, the TTL can be reduced prior
   to the change to minimize inconsistency during the change,
   and then increased back to its former value following the
   change.

which is the way to operate DNS when host addresses are changing,
for example, by multihoming configuration changes.

In addition, when a dual-homed site with end to end multihoming
changes one of its ISPs, it is a good idea to offer all three
addresses by DNS during the change. Make before break.


 With TCP, applications must be able to pass multiple addresses to
 transport layer (e.g. BSD socket).

which implies addresses are supplied from applications by
DNS look up.


Which is a bit of hand-waving since the protocol can't do anything
with that information regardless of whether you expand the API to
provide it.


Read my draft, which explains how TCP should be modified.

    Masataka Ohta



Re: Smaller than a /24 for BGP?

2023-01-28 Thread Masataka Ohta

William Herrin wrote:


Use Multipath TCP
https://datatracker.ietf.org/group/mptcp/documents/


Doesn't work well. Has security problems (mismatch between reported IP
addresses used and actual addresses in use) and it can't reacquire the
opposing endpoint if an address is lost before a new one is
communicated.


It merely means MPTCP is wrongly architected. Dynamically changing
IP addresses is for mobility (if you don't mind location privacy),
not for multihoming.

The following way in my ID:

   The easiest way for applications know all the addresses of the
   destination is to use DNS. With DNS reverse, followed by forward,
   lookup, applications can get a list of all the addresses of the
   destination from an address of the destination.

does not have any such problem and should be as safe as
happy eyeball for two or more IPv4/IPv6 addresses.

As for (long lasting) TCP, my ID says:

   With TCP, applications must be able to pass multiple addresses to
   transport layer (e.g. BSD socket).

which implies addresses are supplied from applications by
DNS look up.

Though a client may, at the time TCP connection is established,
send a list of its IP addresses to a server, which may have some
security complications, it is simpler to let the server just
rely on DNS:

   With DNS reverse, followed by forward,
   lookup, applications can get a list of all the addresses of the
   destination from an address of the destination.

As I pointed out in the previous mail, DNS already supports
end to end multihoming at the application layer to try all
the addresses of name servers, on which other applications
can safely rely.
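
A minimal sketch of the reverse-then-forward lookup with the
standard socket API:

    import socket

    def peer_addresses(addr):
        # Reverse lookup: address -> hostname.
        host, _aliases, _addrs = socket.gethostbyaddr(addr)
        # Forward lookup: hostname -> all registered addresses.
        infos = socket.getaddrinfo(host, None)
        return host, sorted({info[4][0] for info in infos})

A server holding a long-lasting TCP connection can use this to
learn the other addresses of a peer whose address it already
knows.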

Masataka Ohta



Re: Smaller than a /24 for BGP?

2023-01-28 Thread Masataka Ohta

William Herrin wrote:


That multihomed sites rely on the entire Internet
for computation of the best ways to reach them is not
a healthy way of multihoming.


This was studied in the IRTF RRG about a decade ago. There aren't any
other workable ways of multihoming compatible with the TCP protocol,
not even in theory.

A decade? The problem and the solution were thoroughly studied
by me long ago, and the first ID was already available in 2000.

The 5th version is here:

https://datatracker.ietf.org/doc/html/draft-ohta-e2e-multihoming-05.txt

I've found that you can access the first one via the "Compare
versions" feature of the web page.


So,
another way of multihoming critically depends on replacing the layer-4
protocols with something that doesn't intermingle the IP address with
the connection identifier.


Wrong. As is stated in my ID that:

   On the other hand, with end to end multihoming, multihoming is
   supported by transport (TCP) or application layer (UDP etc.) of end
   systems and does not introduce any problem in the network and works
   as long as there is some connectivity between the end systems.

end to end multihoming may be supported at the application layer
by trying all the available addresses, which is what DNS and
SMTP are actually doing.

TCP modification is just an option useful for long lasting
TCP connections.

    Masataka Ohta



Re: Smaller than a /24 for BGP?

2023-01-27 Thread Masataka Ohta

Lars Prehn wrote:


Accepting and globally redistributing all hyper-specifics increases
the routing table size by >100K routes (according to what route
collectors see).


That figure is a guaranteed minimum, but there should be 10 or
100 times more desire for hyper-specifics, suppressed by
the established practice (since the early days with class C).

That multihomed sites rely on the entire Internet
for computation of the best ways to reach them is not
a healthy way of multihoming.

    Masataka Ohta


Re: Smaller than a /24 for BGP?

2023-01-24 Thread Masataka Ohta

Jon Lewis wrote:


Yeah, but in another couple years we'll breach the 1M mark and
everybody will have fresh routers with lots of TCAM for a while. If
that were the only issue, it'd be a matter of timing the change well.


Everybody will need them.  Not all will get (or be able to get) them.


Wrong. For /24, a direct lookup in a 16M-entry SRAM is enough.
Updating 64K entries for a /8 should not be a problem, though
you may also have a 64K-entry SRAM for /16.

In addition, for a small number of local smaller-than-/24
prefixes, another lookup of a radix tree in a smaller SRAM
(with 64K entries, we can subdivide 256 /24s into /32s)
should be possible.

But, there is no need for costly and power wasting TCAM.

So far, I ignore IPv6, of course.
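
A minimal sketch of such a direct-indexed table (Python for
brevity; the point is that a lookup is a single memory read):

    import array, ipaddress

    # One 16-bit next-hop index per /24: 2**24 = 16M entries.
    table = array.array('H', [0]) * (2 ** 24)

    def install(prefix, next_hop):
        # Expand a route of length <= 24 into the /24-indexed table.
        net = ipaddress.ip_network(prefix)
        first = int(net.network_address) >> 8
        count = 1 << (24 - net.prefixlen)
        table[first:first + count] = array.array('H', [next_hop]) * count

    def lookup(addr):
        return table[int(ipaddress.ip_address(addr)) >> 8]

For example, install('10.0.0.0/8', 7) updates 64K entries, and
lookup('10.1.2.3') then returns 7 with one indexed read.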

Masataka Ohta



Re: Starlink routing

2023-01-24 Thread Masataka Ohta

Jorge Amodio wrote:


You, seemingly, do not have much knowledge on UUNET.

Of course I don't :-)



atina   agomar(DAILY), antar(DAILY), biotlp(DAILY), cab(HOURLY),
 cedro(EVENING), cenep(DAILY), cneaint(DAILY), cnea(EVENING),
 cnielf(DAILY), colimpo(DAILY), confein(DAILY), criba(EVENING),
 curbre(EVENING), dacfyb(DEMAND), dcfcen(DEMAND), ecord(DEMAND),
 enace(DAILY), epfrn(EVENING), fb1(DAILY), fcys(DAILY), fecic(DAILY),
 gagcha(EVENING), getinfo(DAILY), hasar(DAILY), iaros(DAILY),
 intiar(DAILY), invapba(DAILY/2), invapqq(DAILY/2), isoft(DAILY),
 itcgi(DAILY), labdig(DAILY), lasbe(DAILY), licmdp(EVENING),
 lis(EVENING), ludo(DAILY), maap(DAILY), meyosp(DAILY),
 minerva(DAILY), minjus(DAILY), mlearn(DAILY), occam(EVENING),
 oceanar(DAILY), onba(DAILY), opsarg(DEMAND), pnud009(EVENING),
 sadio(DAILY), saravia(DAILY), sdinam(DAILY), secyt(DEMAND),
 spok(DAILY), sykes(DAILY), tandil(DAILY), tsgfred(WEEKLY),
 ulatar(EVENING), unisel(EVENING), uunet(DEMAND)


So, you now remember that UUCP links were scheduled.

Masataka Ohta




Re: Starlink routing

2023-01-23 Thread Masataka Ohta

Jorge Amodio wrote:


This gets sort of merged with DTN (Delay/Disruption Tolerant Networking.)


I have been saying that DTN is a reinvention of UUNET.



Hmmm, nope not even close.


You, seemingly, do not have much knowledge on UUNET.

As such, it should be noted that, in UUNET, availability of
phone links between computers was scheduled.



You must be talking about UUCP, UUNET was a company.


Why, do you think, was UUNET as a company named so?

It was an organization to offer connectivity to UseNET but some
used the word to just mean UseNET. See, for example:

https://docs.oracle.com/cd/E19957-01/805-4368/gavzo/index.html
UUNET
(n.) A network that carries electronic newsgroups, aggregates
of many electronic messages that are sorted by topic, to
thousands of users on hundreds of workstations worldwide.


Availability of links was declared not scheduled,

Declared in map files used by pathalias? But that's not my point.

UUCP links were not permanent but scheduled. See, for example:


https://www.ibm.com/docs/en/zos/2.4.0?topic=systems-schedule-periodic-uucp-transfers-cron
Schedule periodic UUCP transfers with cron


so pathalias was able to
figure the best UUCP path from a given UUCP node.


Such initial attempts were not so elegant or scalable.

UUCP networks as DTN were brought to perfection through
integration with the Internet relying on DNS MX RRs.

Masataka Ohta



Re: Starlink routing

2023-01-23 Thread Masataka Ohta

Jorge Amodio wrote:


We are in the process of starting a new Working Group at IETF, Timer
Variant Routing or TVR.
https://datatracker.ietf.org/group/tvr/about/

Some of the uses cases are for space applications where you can predict or
schedule the availability and capacity of "links" (radio, optical)


Even though the current routing protocols have no difficulty
treating unpredictable/unscheduled changes of links?


This gets sort of merged with DTN (Delay/Disruption Tolerant Networking.)


I have been saying that DTN is a reinvention of UUNET.

As such, it should be noted that, in UUNET, availability of
phone links between computers was scheduled.

    Masataka Ohta



Re: Starlink routing

2023-01-23 Thread Masataka Ohta

Matthew Petach wrote:


Unlike most terrestrial links, the distances between satellites are
not fixed, and thus the latency between nodes is variable, making the
concept of "Shortest Path First" calculation a much more dynamic and
challenging one to keep current, as the latency along a path may be
constantly changing as the satellite nodes move relative to each
other, without any link state actually changing to trigger a new SPF
calculation.


As LEO satellites should be leaves of a network of MEO satellites,
an update period of 1 minute between MEO satellites should be enough,
which is not so dynamic.

The physical layer of MEO communications must (to save power and to
prevent broadcast storms) be point to point with known orbital
elements, and the link layer should be some point to point protocol,
perhaps with ARQ.

As the only meaningful metric between satellites is physical
distance, the 16-bit metric of OSPF should be enough.
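
A minimal Python sketch (illustrative only; the 60,000 km ceiling is
an assumed bound on MEO inter-satellite distances): scale distance
linearly into the 16-bit metric.

    MAX_KM = 60000                     # assumed distance ceiling
    MAX_METRIC = 0xFFFF                # 16-bit OSPF interface cost

    def ospf_cost(distance_km):
        # Proportional cost, clamped to always fit in 16 bits.
        cost = round(distance_km * MAX_METRIC / MAX_KM)
        return max(1, min(cost, MAX_METRIC))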

The most annoying part is to have multiple ground stations,
which, as usual, makes the MEO network DFZ with more than 1M
routing table entries.

    Masataka Ohta


Re: A straightforward transition plan (was: Re: V6 still not supported)

2023-01-13 Thread Masataka Ohta

Pascal Thubert (pthubert) wrote:

Hi,


Solutions must first avoid broadcast as much as possible, because
there's also the cost of it.


Though I'm not saying all the broadcast must be repeated,
if you think moderate broadcast is costly, just say,
CATENET.

I remember the old days when the entire network of CERN, with
thousands of hosts, was managed as a single Ethernet,
several years after we learned that dividing a network by
routers can prevent various problems caused by broadcast.

It was, at least partly, because operating multi-protocol
routers is painful. Unlike at most sites at that time, non-IP
protocols such as DECnet were popular at CERN.

As IPv4 became dominant, problems went away.


Then you want zerotrust, ND is so easy to
attack from inside and even outside. This is RFC 8928.


As many people are saying, zerotrust relies on PKI, which
blindly trusts CAs as TTPs (trusted third parties), which
were confirmed to be untrustworthy third parties by
Diginotar, so zerotrust is not very meaningful beyond
marketing hype.

Anyway, relying on link broadcast implies that the link
is trusted to some extent, which is not ND specific.


Ethernet is enterprise networks is largely virtualized. We cannot
offer fast and reliable broadcast services on a worldwide overlay.


Unlike CERN in the past, today, I can see no point in having a large
Ethernet, though some operators may be hyped into deploying an
expensive telco service for nothing.


Add to that the desire by the device to own more and more addresses.


What? How can it happen with IPv4?


You want a contract between that the host and the network that the
host owns an address and is reachable at that address. Like any
contract, that must be a negotiation. ND is not like that. RFC 8505
is exactly that.


Ignoring poor IPv6, I'm afraid it is a property not of ARP but of DHCP.


It may be more constructive to work on proxy ARP suitable for
Wifi, which may be enforced by the Wifi Alliance. An RFC may be
published if the Wifi industry requests IETF to do so.


This is effectively done already for ND.


I agree with you but my point is that it is more constructive for ARP.


I guess the design can be easily retrofitted to ARP. ND is really
designed exactly as ARP. The differences were for the show, the real
steps that should have been made were not. But now with RFC 8505 we
have a modern solution. The problem is no more on the standard side,
it is adoption. People will not move if it does not hurt enough. And
they can bear a lot.


But, for adoption, some formal document, not necessarily a
(standards track) RFC, is necessary.

Masataka Ohta


Re: A straightforward transition plan (was: Re: V6 still not supported)

2023-01-12 Thread Masataka Ohta

Pascal Thubert (pthubert) wrote:

Hi,


For that issue at least there was some effort. Though ATM and FR
appear to be long gone, the problem got even worse with pseudo wires
/ overlays and wireless.

It was tackled in the IoT community 10+ years ago and we ended up
with RFC 8505 and 8928. This is implemented in LoWPAN devices and
deployed by millions. Allowing IPv6 subnets of thousands on
constrained radios.


When I mentioned the problem for the first time on the IPng or IPv6
list (I can't find any archive, are there any?), Christian
Huitema mentioned it could be solved by ND over NBMA, but
the problem is not NB; it is that broadcast over Wifi is unreliable.

As such, the solutions should be based on the fact that
repeating an unreliable broadcast makes it reliable.
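
A minimal Python sketch of that fact (illustrative only): if one
broadcast is received with probability p, n independent repeats are
all lost only with probability (1 - p)**n.

    def delivery_probability(p, n):
        # At least one of n independent repeats gets through.
        return 1 - (1 - p) ** n

    print(delivery_probability(0.9, 3))    # 0.999 with 3 repeats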


I spent a bit of time explaining the architecture issue (in mild
terms) and solutions in
https://datatracker.ietf.org/doc/html/draft-thubert-6man-ipv6-over-wireless-12.


Though you wrote in the draft:

Reducing the speed at the physical (PHY) layer for
broadcast transmissions can increase the reliability

longer (in time) packets mean a higher collision probability (with
hidden terminals) and less reliability.

A link broadcast domain must be the same for all the members
of the link and should be defined as the set of terminals which
can receive broadcast from a central station (or stations)
with a certain probability, which is why Wifi broadcast is
relayed by a central station.


 So far we failed to get those RFCs implemented on the major stacks
for WiFi or Ethernet.


Ethernet? Even though its broadcast is reliable?

Though Wifi bridged by Ethernet may have its own problems,
they are Wifi-specific problems.


There’s a new thread at IETF 6MAN just now on adopting just the draft
above - not even the solution. It is facing the same old opposition
from the same few and a lot of silence.


You can't expect much from people still insisting on IPv6 as is.


My suggestion is still to fix IPv6 as opposed to drop it, because I
don’t see that we have another bullet to fire after that one. For
that particular issue of fixing ND, new comments and support at the
6MAN on the draft above may help.


It may be more constructive to work on proxy ARP suitable
for Wifi, which may be enforced by the Wifi Alliance. An RFC
may be published if the Wifi industry requests IETF to do so.

Masataka Ohta


Re: A straightforward transition plan (was: Re: V6 still not supported)

2023-01-11 Thread Masataka Ohta

Randy Bush wrote:


three of the promises of ipng which ipv6 did not deliver
   o compatibility/transition,
   o security, and
   o routing & renumbering


You miss a promise of

   o ND over ATM/NBMA

which caused IPv6 to lack a notion of link broadcast.

Masataka Ohta



Re: SDN Internet Router (sir)

2023-01-11 Thread Masataka Ohta

Mike Hammett wrote:


" With plain IP routers?"



Yes, or, well, relatively plain, depending on the implementation.


As completely plain routers have no difficulty treating a
default route, it is a waste of money and effort to have
not-so-plain routers do so, regardless of whether the
routers are SDN ones or not.

    Masataka Ohta



Re: SDN Internet Router (sir)

2023-01-07 Thread Masataka Ohta

Matthew Walster wrote:


No... It's action based. You can send it a different route, you can
replicate it, you can drop it, you can mutate it...


Replication is a poor alternative for multicast.



You conveniently ignore things like IDS, port mirroring, things like that.


Wrong. Instead, you conveniently ignore that such forwarding
requires that a link between an SDN router and a monitoring device
have the same or larger MTU than an incoming link of the SDN
router, which means the router and the monitoring device must
be tightly coupled, effectively to be a single device.

The possibility of packet loss between them often requires
that they actually be the same device.


No. There are far more actions than for prioritisation.


Just for fun? I'm afraid I already mentioned so.


What if you want to make sure certain classes of traffic do not flow over a
link, because it is unencrypted and/or sensitive, but you're happy to send
as much TLS wrapped data as you like?


You are wrongly assuming that TLS-wrapped packets can be identified
packet by packet, as I wrote:

>> Unless pattern is as simple as having certain port number,
>> stateful filtering almost always needs all packets including
>> those matching expected pattern, I'm afraid.

So?


What if you want to sample some flows in an ERSPAN like mechanism?


See above for MTU issues.


What if you want to urgently drop a set of flows based on a known DDOS
signature?


Urgently? Even though a DDOS signature is known in advance?

Why?


Unless pattern is as simple as having certain port number,
stateful filtering almost always needs all packets including
those matching expected pattern, I'm afraid.



Or a certain set of IP addresses. Policy based routing.


That's even simpler than a port number and can be treated by
having or not having proper routing table entries.


If default route is acceptable, just rely on it along with
50 non default routes with plain IP routers.



That's what OP is suggesting.


With plain IP routers?


That's what SIR is. Classifying prefixes by
traffic and only keeping the ones with the highest volume of traffic,
discarding the rest, relying on the default route to infill.


Given the connectionless nature of the Internet, route changes based
on volume of traffic averaged over a certain period of time are more
harmful than useful.

    Masataka Ohta


Re: SDN Internet Router (sir)

2023-01-07 Thread Masataka Ohta

Matthew Walster wrote:


No... It's action based. You can send it a different route, you can
replicate it, you can drop it, you can mutate it...


Replication is a poor alternative for multicast.

For other actions, why, do you think, are they performed?

Just for fun? Or to differentiate treatment of some packets,
that is, prioritization?


You can send it to a
different destination for stateful filtering when it doesn't match an
expected pattern!


Unless pattern is as simple as having certain port number,
stateful filtering almost always needs all packets including
those matching expected pattern, I'm afraid.


SDN is not just QoS routing, please stop saying that.


See above.


Nope, not true. Had 1000 routes, only 100 available in FIB. So you filter
to the top 50 doing traffic and default route the rest of the traffic. Less
entries.


If default route is acceptable, just rely on it along with
50 non default routes with plain IP routers.

Masataka Ohta



Re: SDN Internet Router (sir)

2023-01-06 Thread Masataka Ohta

Matthew Walster wrote:


SDN does not imply QoS routing,


As long as the shortest path is comfortable enough, no, it
does not have to.


it's just one aspect of it. Some use it for
classifying guest traffic etc.


If a special path is provided for guest or otherwise
prioritized traffic, that's QoS routing.

Anyway, prioritization needs more, not fewer,
routing table entries.

Masataka Ohta



Re: SDN Internet Router (sir)

2023-01-06 Thread Masataka Ohta

Christopher Morrow wrote:


Some of the reasoning behind 'i need/want to do SDN things' is 'low fib
device' sort of reasonings.


What?

SDN is a poor alternative for those who can't construct a
network with fully automated QoS guarantees.

Even with SDN, a QoS guarantee implies QoS routing, requiring a
dedicated routing table entry for each flow, which will not
shrink but bloat routing tables regardless of whether you
call it FIB or not.

Masataka Ohta



Re: Large RTT or Why doesn't my ping traffic get discarded?

2022-12-22 Thread Masataka Ohta

Jerry Cloe wrote:


Because there is no standard for discarding "old" traffic, only
discard is for packets that hop too many times. There is, however, a
standard for decrementing TTL by 1 if a packet sits on a device for
more than 1000ms, and of course we all know what happens when TTL
hits zero. Based on that, your packet could have floated around for
another 53 seconds.

Totally wrong, as the standard says TTL MUST be decremented by at
least one on every hop and MAY (but need not) be decremented further,
as specified by the standard for IPv4 router requirements (RFC 1812):

   When a router forwards a packet, it MUST reduce the TTL by at least
   one.  If it holds a packet for more than one second, it MAY decrement
   the TTL by one for each second.

As for IPv6,

   Unlike IPv4, IPv6 nodes are not required to enforce maximum packet
   lifetime.  That is the reason the IPv4 "Time to Live" field was
   renamed "Hop Limit" in IPv6.  In practice, very few, if any, IPv4
   implementations conform to the requirement that they limit packet
   lifetime, so this is not a change in practice.
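
A minimal Python sketch of the RFC 1812 rule (illustrative only; the
per-second decrement is a MAY, hence optional here):

    def forward_ttl(ttl, seconds_held=0, per_second=False):
        ttl -= 1                      # MUST: at least one per hop
        if per_second:                # MAY: one more per second held
            ttl -= int(seconds_held)
        if ttl <= 0:
            raise ValueError("TTL expired: drop, send ICMP Time Exceeded")
        return ttl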

    Masataka Ohta



Re: Alternative Re: ipv4/25s and above Re: 202211232221.AYC

2022-11-28 Thread Masataka Ohta

Vasilenko Eduard via NANOG wrote:


Big OTTs installed caches all over the world.
Big OTTs support IPv6.


As the large network operational cost to support IPv6 is
negligible for OTTs, which spend a lot more money at the
application layer, they may.


Hosts prefer IPv6.


No.

As many retail ISPs cannot afford the operational cost of
IPv6, they are IPv4 only, which makes the hosts served by
them IPv4 only.

Possible exceptions are ISPs offering price (not
necessarily value) added network services in a
noncompetitive environment. But end users suffer
from the added price.

Masataka Ohta



Re: Jon Postel Re: 202210301538.AYC

2022-11-05 Thread Masataka Ohta

William Allen Simpson wrote:


Something similar happened with IPv6.  Cisco favored a design where only
they had the hardware mechanism for high speed forwarding.  So we're
stuck with 128-bit addresses and separate ASNs.


Really?

Given that high speed forwarding at that time meant TCAM, the
difference of a 128-bit address should mean merely twice the
TCAM capacity of a 64-bit address.

I think the primary motivation for 128 bits was to somehow
encode NSAP addresses into IPng ones, as exemplified
by RFC 1888. The motivation does not make any
engineering sense, but neither does IPv6.

Masataka Ohta



Re: 400G forwarding - how does it work?

2022-08-12 Thread Masataka Ohta

sro...@ronan-online.com wrote:


How do you propose to fairly distribute market data feeds to the
market if not multicast?


Unicast with randomized order.

To minimize latency, bloated buffers should be avoided
and TCP with a configured small (initial) RTT should be
used.
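
A minimal Python sketch of unicast with randomized order
(illustrative only): each update goes out in a fresh random order,
so no subscriber is systematically first on the wire.

    import random

    def distribute(update, subscriber_sockets):
        order = list(subscriber_sockets)
        random.shuffle(order)         # new order per update: fairness
        for sock in order:
            sock.sendall(update)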

    Masataka Ohta


Re: 400G forwarding - how does it work?

2022-08-09 Thread Masataka Ohta

Dave Taht wrote:


But as fair queuing does not scale at all, they disappeared
long ago.


What do you mean by FQ, exactly?


Fair queuing is "fair queuing", not some queuing idea
which is, by someone, considered "fair".

See, for example,

https://en.wikipedia.org/wiki/Fair_queuing


"5 tuple FQ" is scaling today


Fair queuing does not scale w.r.t. the number of queues.

    Masataka Ohta


Re: 400G forwarding - how does it work?

2022-08-09 Thread Masataka Ohta

Matthew Huff wrote:


Also, for data center traffic, especially real-time market data and
other UDP multicast traffic, micro-bursting is one of the biggest
issues especially as you scale out your backbone.


Are you saying you rely on multicast even though loss of a packet
means loss of a large amount of money?

Is that the reason why you use large buffers to eliminate the
possibility of packet drops caused by buffer overflow but not by
other reasons?

Masataka Ohta


Re: 400G forwarding - how does it work?

2022-08-09 Thread Masataka Ohta

Saku Ytti wrote:


With such an imaginary assumption, according to the end to end
principle, the customers (the ends) should use paced TCP instead



I fully agree, unfortunately I do not control the whole problem
domain, and the solutions available with partial control over the
domain are less than elegant.


OK. But you should be aware that, with a bloated buffer, all
the customers sharing the buffer will suffer from delay.

Masataka Ohta



Re: 400G forwarding - how does it work?

2022-08-08 Thread Masataka Ohta

Saku Ytti wrote:


which is, unlike Yttinet, the reality.


Yttinet has pesky customers who care about single TCP performance over
long fat links, and observe poor performance with shallow buffers at
the provider end.


With such an imaginary assumption, according to the end to end
principle, the customers (the ends) should use paced TCP instead
of paying an unnecessarily bloated amount of money to intelligent
intermediate entities of ISPs using expensive routers with
bloated buffers.


Yttinet is cost sensitive and does not want to do
work, unless sufficiently motivated by paying customers.


I understand that if customers follow the end to end principle,
revenue of "intelligent" ISPs will be reduced.

    Masataka Ohta





Re: 400G forwarding - how does it work?

2022-08-08 Thread Masataka Ohta

Saku Ytti wrote:


If RTT is large, your 100G runs over several 100/400G
backbone links with many other traffic, which makes the
burst much slower than 10G.


In Ohtanet, I presume.


which is, unlike Yttinet, the reality.

Masataka Ohta


Re: 400G forwarding - how does it work?

2022-08-08 Thread Masataka Ohta

Saku Ytti wrote:


When many TCPs are running, bursts are averaged and traffic
is Poisson.


If you grow a window, and the sender sends the delta at 100G, and
receiver is 10G, eventually you'll hit that 10G port at 100G rate.


Wrong. If it's local communication, where RTT is small, the
window is not so large, smaller than an unbloated router buffer.
If RTT is large, your 100G runs over several 100/400G
backbone links with much other traffic, which makes the
burst much slower than 10G.

Masataka Ohta


Re: 400G forwarding - how does it work?

2022-08-08 Thread Masataka Ohta

dip wrote:


I have seen cases where traffic behaves
more like self-similar.


That could happen if there are a small number of TCP streams
or multiple TCPs are synchronized through interactions on
bloated buffers, which is one reason why we should avoid
bloated buffers.


Do you have any good pointers where the research has been done that today's
internet traffic can be modeled accurately by Poisson? For as many papers
supporting Poisson, I have seen as many papers saying it's not Poisson.

https://www.icir.org/vern/papers/poisson.TON.pdf


It is based on observations between 1989 and 1994, when
the Internet backbone was slow and the number of users
was small, which means the number of TCP streams
running in parallel was small.

For example, merely 124M packets over 36 days of observation
[LBL-1] is slower than 500kbps, which could be filled
up by a single TCP connection even by computers of that
time, and is not a meaningful measurement.
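
The arithmetic behind "slower than 500kbps" in Python (assuming
1500B packets; any smaller mean packet size only makes the average
rate lower):

    packets = 124e6
    seconds = 36 * 86400
    print(packets / seconds * 1500 * 8 / 1e3)    # ~478 kbps average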


https://www.cs.wustl.edu/~jain/cse567-06/ftp/traffic_models2/#sec1.2


It merely states that some use non-Poisson traffic models.

Masataka Ohta


Re: 400G forwarding - how does it work?

2022-08-08 Thread Masataka Ohta

sro...@ronan-online.com wrote:


There are MANY real world use cases which require high throughput at
64 byte packet size.


Certainly, there were imaginary-world use cases which required
guaranteeing a throughput as high as 64kbps with a 48B payload
size, for which a 20(40)B IP header was obviously painful and a 5B
header was used. At that time, poor fair queuing was assumed,
which requires a small packet size for short delay.

But as fair queuing does not scale at all, they disappeared
long ago.

> Denying those use cases because they don’t fit
> your world view is short sighted.

That could have been a valid argument 20 years ago.

    Masataka Ohta


Re: 400G forwarding - how does it work?

2022-08-07 Thread Masataka Ohta

Saku Ytti wrote:


I'm afraid you imply too much buffer bloat only to cause
unnecessary and unpleasant delay.

With 99% load M/M/1, 500 packets (750kB for 1500B MTU) of
buffer is enough to make packet drop probability less than
1%. With 98% load, the probability is 0.0041%.



I feel like I'll live to regret asking. Which congestion control
algorithm are you thinking of?


I'm not assuming a LAN environment, for which paced TCP may
be desirable (if the bandwidth requirement is tight, which is
unlikely in a LAN).


But Cubic and Reno will burst tcp window growth at sender rate, which
may be much more than receiver rate, someone has to store that growth
and pace it out at receiver rate, otherwise window won't grow, and
receiver rate won't be achieved.


When many TCPs are running, bursts are averaged and traffic
is Poisson.


So in an ideal scenario, no we don't need a lot of buffer, in
practical situations today, yes we need quite a bit of buffer.


That is an old theory known to be invalid (Ethernet switches with
small buffers are enough for IXes) and theoretically denied by:

Sizing router buffers
https://dl.acm.org/doi/10.1145/1030194.1015499

after which paced TCP was developed for unimportant exceptional
cases of LAN.

> Now add to this multiple logical interfaces, each having 4-8 queues,
> it adds up.

Having so many queues requires sorting of queues to properly
prioritize them, which costs a lot of computation (and
performance loss) for no benefit and is a bad idea.

> Also the shallow ingress buffers discussed in the thread are not delay
> buffers and the problem is complex because no device is marketable
> that can accept wire rate of minimum packet size, so what trade-offs
> do we carry, when we get bad traffic at wire rate at small packet
> size? We can't empty the ingress buffers fast enough, do we have
> physical memory for each port, do we share, how do we share?

People who use irrationally small packets will suffer, which is
not a problem for the rest of us.

    Masataka Ohta




Re: 400G forwarding - how does it work?

2022-08-07 Thread Masataka Ohta

ljwob...@gmail.com wrote:


Buffer designs are *really* hard in modern high speed chips, and
there are always lots and lots of tradeoffs.  The "ideal" answer is
an extremely large block of memory that ALL of the
forwarding/queueing elements have fair/equal access to... but this
physically looks more or less like a full mesh between the
memory/buffering subsystem and all the forwarding engines, which
becomes really unwieldly (expensive!) from a design standpoint.  The
amount of memory you can practically put on the main NPU die is on
the order of 20-200 **mega** bytes, where a single stack of HBM
memory comes in at 4GB -- it's literally 100x the size.


I'm afraid you imply too much buffer bloat only to cause
unnecessary and unpleasant delay.

With 99% load M/M/1, 500 packets (750kB for 1500B MTU) of
buffer is enough to make packet drop probability less than
1%. With 98% load, the probability is 0.0041%.
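
Those figures follow from the M/M/1 tail formula: the probability
that the queue holds at least B packets, that is, that a buffer of B
packets overflows, is rho**B at utilization rho. A quick Python
check:

    def overflow_probability(rho, buffer_packets):
        return rho ** buffer_packets        # M/M/1 tail: rho**B

    print(overflow_probability(0.99, 500))  # ~0.0066, i.e. < 1%
    print(overflow_probability(0.98, 500))  # ~4.1e-05, i.e. 0.0041%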

But there are so many router engineers who think that, with a
bloated buffer, packet drop probability can be zero, which
is wrong.

For example,


https://www.broadcom.com/products/ethernet-connectivity/switching/stratadnx/bcm88690
Jericho2 delivers a complete set of advanced features for
the most demanding carrier, campus and cloud environments.
The device supports low power, high bandwidth HBM packet
memory offering up to 160X more traffic buffering compared
with on-chip memory, enabling zero-packet-loss in heavily
congested networks.

    Masataka Ohta


Re: 400G forwarding - how does it work?

2022-07-27 Thread Masataka Ohta

James Bensley wrote:


The BCM16K documentation suggests that it uses TCAM for exact
matching (e.g.,for ACLs) in something called the "Database Array"
(with 2M 40b entries?), and SRAM for LPM (e.g., IP lookups) in
something called the "User Data Array" (with 16M 32b entries?).


Which documentation?

According to:

https://docs.broadcom.com/docs/16000-DS1-PUB

figure 1 and related explanations:

Database records 40b: 2048k/1024k.
Table width configurable as 80/160/320/480/640 bits.
User Data Array for associated data, width configurable as
32/64/128/256 bits.

means that the header extracted by the 88690 is analyzed by the 16K,
finally resulting in 40b of information (a lot shorter than IPv6
addresses, but still maybe enough for an IPv6 backbone to identify
sites) by a "database" lookup, obviously by CAM because 40b is
painful for SRAM, which is then converted to "32/64/128/256 bits
data".


1 second / 164473684 packets = 1 packet every 6.08 nanoseconds, which
is within the access time of TCAM and SRAM


As high speed TCAM and SRAM should be pipelined, the cycle time,
which is what matters, is shorter than the access time.
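
A minimal Python sketch of why the pipeline cycle time, not the
access latency, bounds throughput (the 12 ns latency and 4 stages
are assumed values, purely for illustration):

    pps = 164_473_684
    budget_ns = 1e9 / pps             # ~6.08 ns per packet at line rate
    access_ns = 12.0                  # assumed total lookup latency
    stages = 4                        # assumed pipeline depth
    cycle_ns = access_ns / stages     # one lookup completes per cycle
    print(round(budget_ns, 2), cycle_ns)    # 6.08 ns budget vs 3.0 ns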

Finally, it should be pointed out that most, if not all, performance
figures such as MIPS and Flops are merely guaranteed not to be exceeded.

In this case, if deep packet inspection of lengthy headers is
required for some complicated routing schemes or to satisfy NSA
requirements, communication speed between the 88690 and the 16K
will be the limiting factor for PPS, resulting in a lot less than
the maximum possible PPS.

    Masataka Ohta


Re: Upstream bandwidth usage

2022-06-10 Thread Masataka Ohta

Michael Thomas wrote:

If it's so tiny, why shape it aggressively? Why shouldn't I be able to 
burst to whatever is available at the moment? I would think most users 
would be happy with that.


Seemingly, to distinguish inexpensive economy and expensive
business class services.

Masataka Ohta


Re: [EXTERNAL] FCC proposes higher speed goals (100/20 Mbps) for USF providers

2022-06-08 Thread Masataka Ohta

David Conrad wrote:


I'm with Jason. If even a small percentage of the "representative use
cases" that came out of the ITU's Network 2030 Focus Group or other
similar efforts comes to pass, bandwidth demand will continue to
grow.


As Moore's law has ended, it means users must pay a lot, which
is favorable to the telcos constituting the ITU.

    Masataka Ohta


Re: FCC proposes higher speed goals (100/20 Mbps) for USF providers

2022-06-06 Thread Masataka Ohta

Dave Taht wrote:

"New Zealand is approximately 268,838 sq km, while United States is 
approximately 9,833,517 sq km, making United States 3,558% larger

than New Zealand. Meanwhile, the population of New Zealand is ~4.9
million people (327.7 million more people live in United States)."


That NZ has a lower population density than the US means the last
mile problem is more severe in NZ than in the US, though the actual
severity depends on the detailed population distribution.

    Masataka Ohta


Re: FCC proposes higher speed goals (100/20 Mbps) for USF providers

2022-06-06 Thread Masataka Ohta

Dave Taht wrote:


Looking back 10 years, I was saying the same things, only then I felt
it was 25Mbit circa mike belshe's paper. So real bandwidth
requirements only doubling every decade might be a new equation to
think about...


The required resolution of pictures is bounded by the resolution of
our eyes, which is fixed.

For TVs at home, IMHO, baseband 2K should be enough, the quality of
which may be better than highly compressed 4K.
Masataka Ohta


Re: FCC proposes higher speed goals (100/20 Mbps) for USF providers

2022-06-03 Thread Masataka Ohta

Livingood, Jason via NANOG wrote:


That shows up as increased user demand (usage), which means that the
CAGR will rise and get factored into future year projections.


You should recognize that Moore's law has ended.

Masataka Ohta


Re: FCC proposes higher speed goals (100/20 Mbps) for USF providers

2022-06-03 Thread Masataka Ohta

Owen DeLong wrote:


USF is great for rural, but it has turned medium density and suburban areas 
into connectivity wastelands.

Carrier & cable lobbying organizations say that free market competition by 
multiple providers provide adequate service in those areas.


That's simply untrue, because of natural regional monopoly.


Lobbyists lie? Say it isn’t so.

You seem somehow surprised by this.


No, not at all. So?

Masataka Ohta


Re: FCC proposes higher speed goals (100/20 Mbps) for USF providers

2022-06-02 Thread Masataka Ohta

Sean Donelan wrote:

USF is great for rural, but it has turned medium density and suburban 
areas into connectivity wastelands.


Carrier & cable lobbying organizations say that free market competition 
by multiple providers provide adequate service in those areas.


That's simply untrue, because of natural regional monopoly.

Competing providers must invest the same amount of money to cover
a certain area with their cables, but their revenues are proportional
to their local market shares, which means only the provider with
the largest share can survive.

In urban areas, where local backbone costs, which are proportional
to market shares, exceed cabling costs, there may be some
competition. But the natural regional monopoly is still
possible.

Still, providers relying on older technologies will be
competitively replaced by other providers using newer
technologies, which is why DSL providers have been
disappearing and cable providers will disappear.

In the long run, only fiber providers will survive.

The problem, then, is that, with PON, there is no local
competition even if fibers are unbundled, because providers
with a smaller share can find only a smaller number of
subscribers around PON splitters, while, usually, the fiber
cost between the splitters and stations is the same, which
is why fiber providers prefer PON over SS.

But such a preference is deadly for rural areas where
only one or two homes exist around PON splitters,
in which case SS is less costly.

Masataka Ohta


Re: Question re prevention of enumeration with DNSSEC (NSEC3, etc.)

2022-05-12 Thread Masataka Ohta

John McCormac wrote:


There are various ways, such as crawling the web, to enumerate
domain names.



That is not an efficient method.


Not a problem for large companies or botnets. So, only
small legal players suffer from hiding zone information.


For example, large companies such as google can obtain enumerated
list of all the current most active domains in the world, which
can, then, be used to access whois.


What Google might obtain would be a list of domain names with websites. 
The problem is that the web usage rate for TLDs varies with some ccTLDs 
seeing a web usage rate of over 40% (40% of domain names having 
developed websites) but some of the new gTLDs have web usage rates below 
10%. Some of the ccTLDs have high web usage rates.


You misunderstand my statement. Domain names not offering
HTTP service can also be collected by web crawling.


Hiding DNS zone information from public is beneficial to powerful
entities such as google.


In some respects, yes.


Google can also use gmail to collect domain names used by
sent or received e-mails.

But there is a problem with that because of all 
the FUD about websites linking to "bad" websites that had been pushed in 
the media a few years ago.


Is your concern privacy of "bad" websites?


Another factor that is often missed is the renewal rate of domain names.


That's not a problem related to enumeration of domain names.

A lot of personal data 
such as e-mail addresses, phone numbers and even postal addresses have 
been removed from gTLD records because of the fear of GDPR.


As I have been saying, the problem, *if* *any*, is whois. So?

The zones change. New domain names are registered and domain names are 
deleted. For many TLDs, the old WHOIS model of registrant name, e-mail 
and phone number no longer exists. And there are also WHOIS privacy 
services which have obscured ownership.


As I wrote:

: Moreover, because making ownership information of lands and
: domain names publicly available promotes public welfare
: and domain name owners approve publication of such
: information in advance, there shouldn't be any concern
: of privacy breach forbidden by local law of DE.

that is not a healthy movement.

    Masataka Ohta


Re: Question re prevention of enumeration with DNSSEC (NSEC3, etc.)

2022-05-11 Thread Masataka Ohta

As I wrote:


But some spam actors
deliberately compared zone file editions to single out additions, and
then harass the owners of newly registered domains, both by e-mail and
phone.


If that is a serious concern, stop whois.


There are various ways, such as crawling the web, to enumerate
domain names.

For example, large companies such as google can obtain enumerated
list of all the current most active domains in the world, which
can, then, be used to access whois.

Hiding DNS zone information from public is beneficial to powerful
entities such as google.

As such


A wrench can be a tool or a weapon, depending on how one uses it.


The wrench is whois.


However, something like trust banks may be able to protect the
privacy of domain name owners, if such entities can be regulated
properly, for people who want some privacy.

     Masataka Ohta


Re: Question re prevention of enumeration with DNSSEC (NSEC3, etc.)

2022-05-10 Thread Masataka Ohta

Rubens Kuhl wrote:


But some spam actors
deliberately compared zone file editions to single out additions, and
then harass the owners of newly registered domains, both by e-mail and
phone.


If that is a serious concern, stop whois.


A wrench can be a tool or a weapon, depending on how one uses it.


The wrench is whois.

Masataka Ohta


Re: Question re prevention of enumeration with DNSSEC (NSEC3, etc.)

2022-05-09 Thread Masataka Ohta

Rubens Kuhl wrote:


Is there any case law where someone has asserted a database right for a DNS 
zone?


German law has something that goes somewhat near it, although closer to
a mandate rather than a right:
https://www.denic.de/en/faqs/faqs-for-domain-holders/#code-154


Similar regulation also exists in Japan. However...

Considering that, with a detailed map of a town, one can enumerate
the addresses of all the houses in the town, and owner information
for the houses can be obtained from a land registry office operated
by the government (I know the complications in the US with such
registries), such regulation is not very meaningful.

As privacy breach is caused not by enumeration but by registry,
there is little, if any, reason to avoid enumeration.

Moreover, because making ownership information of lands and
domain names publicly available promotes public welfare
and domain name owners approve publication of such
information in advance, there shouldn't be any concern
of privacy breach forbidden by local law of DE.

Masataka Ohta


Re: Court orders for blocking of streaming services

2022-05-09 Thread Masataka Ohta

Philip Loenneker wrote:


I have a tongue-in-cheek question... if the documentation provided by
the plaintiff to the court, and/or the court documentation including
the final ruling, includes the specific URLs to the websites to
block, does that constitute transmitting links to illegal content?


Doing something authorized by law, in a way specified by the
law, cannot be illegal. So?

Masataka Ohta


Re: Court orders for blocking of streaming services

2022-05-08 Thread Masataka Ohta

Mel Beckman wrote:

You are confusing "illegal" and "guilty".

The first party publicly transmitting illegal contents,
or links to the contents, is guilty, which means the
links themselves are illegal.

But DMCA makes some third party providers providing
illegal contents or illegal links guilty only if some
condition of DMCA is met.

Same for civil liability.

You're incorrect about the DMCA when you say "DMCA treats 'linking' 
to illegal contents as illegal as the contents themselves". 


See above.


You must knowingly link to works that clearly infringe somebody's
copyright.


The same is true if you are transmitting not links but the contents
themselves.

> A link to the Israel.TV websites themselves is not to a specific
> work, so it's not covered by DMCA. So first, as long as you don't
> know that a work is infringing someone's copyright,

You totally miss the point of the order, though I wrote:

: As the order is to those "having actual knowledge of this Default
: Judgment and Permanent Injunction Order",

    Masataka Ohta


Re: Court orders for blocking of streaming services

2022-05-08 Thread Masataka Ohta

Mel Beckman wrote:


But the phrase "or linking to the domain" Includes hundreds, possibly
thousands, of unwitting certain parties:


DMCA treats "linking" to illegal contents as illegal as the
contents themselves, which is why I wrote:

: In addition, it seems to me that name server operators "having
: actual knowledge" that some domain names are used for copyright
: infringements are not protected by DMCA.

> I think I am simply right.

So, you know nothing about DMCA. Read it.

> The lawsuit is contradictory and overreaching.

As for transit ISPs enjoying a safe harbor of DMCA, yes, as I
already said so.

    Masataka Ohta


Re: Court orders for blocking of streaming services

2022-05-08 Thread Masataka Ohta

Mel Beckman wrote:


The plaintiff’s won a default judgement, because the defendants
didn’t show up in court. But they could not have shown up in court,
because they were only listed as "John Does" in the lawsuit. Thus no
defendant could have "actual knowledge" that they were sued,


As the defendants are those identified as "d/b/a Israel.tv, as
the owners and operators of the website, service and/or
applications (the “Website”) located at or linking to the
domain www.Israel.TV;", you are simply wrong.

> For the court to then
> approve sanctions against innocent non-parties to the suit is a
> logical contradiction.

Wrong.

Those knowingly actively cooperating with the defendants are not
innocent at all though DMCA makes some passive cooperation
innocent.

    Masataka Ohta


Re: Court orders for blocking of streaming services

2022-05-08 Thread Masataka Ohta

John Levine wrote:


I agree that the rest of the language demanding that every ISP,
hosting provider, credit union, bank, and presumably nail salon and
coin laundry in the US stop serving the defendants is nuts.


As the order is to those "having actual knowledge of this Default
Judgment and Permanent Injunction Order", according to DMCA, that
should be a reasonable order for hosting providers of illegal
contents but not for transit ISPs.

In addition, it seems to me that name server operators "having
actual knowledge" that some domain names are used for copyright
infringements are not protected by DMCA.

    Masataka Ohta


Re: how networking happens in Hawaii

2022-05-01 Thread Masataka Ohta

William Herrin wrote:


Countries whose law derives from English Common law have a concept of
adverse possession. Details vary but mainly if you can hold the land
for 20 years against the owner's wishes then it's your land.
Conceptually it applies to nations just as surely as individuals.


Such an interpretation of English Common law is against Zionism
promoted by the British government and should be wrong.

Masataka Ohta


Re: Any sign of supply chain returning to normal?

2022-04-24 Thread Masataka Ohta

Randy Bush wrote:


i suspect that, in years of overabundant late stage capitalism, folk
went nuts.  and we are now paying for it.  one of my fave quotes

 I thought of it in a slightly different way--like a space that we
 were exploring and, in the early days, we figured out this
 consistent path through the space: IP, TCP, and so on.  What's been
 happening over the last few years is that the IETF is filling the
 rest of the space with every alternative approach, not necessarily
 any better.  Every possible alternative is now being written down.
 And it's not useful.  -- Jon Postel


And Steve Deering agreed with Jon, saying "Exactly".

That's so funny, because the statement was published in Oct. 1998
and the first RFC on IPv6 was published in Dec. 1995.

    Masataka Ohta


Re: V4 via V6 and IGP routing protocols

2022-04-04 Thread Masataka Ohta

Mark Tinka wrote:


MPLS with nested labels, which is claimed to scale because
nesting represents route hierarchy, just does not scale because
source hosts are required to provide nested labels, which
means the source hosts must have the most current routing tables at
destinations, which requires flat routing without hierarchy or on
demand, that is, flow driven, look up of detailed routing tables
of destinations at a distance.


This detail is limited to PE devices (ingress/egress).


As it requires

>> flat routing without hierarchy or on
>> demand, that is, flow driven, look up of detailed routing tables
>> of destinations at a distance.

MPLS is just broken.

You don't need to 
carry a BGP table in the P devices (core), as only label swapping is 
required.


So?


Fair point, it is a little heavy for an edge box,


Requiring

>> flat routing without hierarchy

means it is fatally heavy for intermediate boxes.

>> or on
>> demand, that is, flow driven, look up of detailed routing tables
>> of destinations at a distance.

means it is fatally heavy for edge boxes.

> In the end, having a flat L2 domain was just simpler.

That's totally against the CATENET model. Why, do you think,
was NHRP abandoned?

> we've never ran into an issue carrying
> thousands of IS-IS IPv4/IPv6 routes this way.

Thousands of? Today, with such powerful CPUs, that is a small
network. So?

    Masataka Ohta



Re: V4 via V6 and IGP routing protocols

2022-04-04 Thread Masataka Ohta

Dave Taht wrote:


Are MPLS or SR too heavy a bat?


MPLS was not an option at the time. It might become one.


MPLS with nested labels, which is claimed to scale because
nesting represents route hierarchy, just does not scale because
source hosts are required to provide nested labels, which
means the source hosts must have the most current routing tables at
destinations, which requires flat routing without hierarchy or on
demand, that is, flow driven, look up of detailed routing tables
of destinations at a distance.

Masataka Ohta


Re: V4 via V6 and IGP routing protocols

2022-04-04 Thread Masataka Ohta

Pascal Thubert (pthubert) wrote:


Hello Ohta-san


Hi,


it is hopeless.


If you look at it, LS - as OSPF and ISIS use it - 


My team developed our own.

Hierarchical QoS Link Information Protocol (HQLIP)
https://datatracker.ietf.org/doc/draft-ohta-ric-hqlip/

which support 256 levels of hierarchy with hierarchical
thinning of link information, including available QoS.


depends on the
fact that all nodes get the same information and react the same way.
Isn't that hopeless too?


If you insist on OSPF or ISIS, yes.


Clearly, the above limits LS applicability to stable links and
topologies, and powered devices. This is discussed at length in
https://datatracker.ietf.org/doc/html/draft-ietf-roll-protocols-survey.
OLSRv2 pushes the model to its limit, don't drive it any faster.


You don't have to say "low power" to notice OSPF is not so good.

With just a quick look at OSPF, I noticed that OSPF, effectively
using link local reliable multicast, is hopeless (as a basis
to construct a hierarchical QoS routing system).

Worse, the minimum hello interval of OSPF is too long for quick
recovery (low power is not required, for example, at the backbone),
which is why the additional complication of having an optical
layer was considered useful.


RIFT (https://datatracker.ietf.org/doc/draft-ietf-rift-rift/) shows
that evolution outside that box is possible.

OK. RIFT is "for Clos and fat-tree network topologies" of data
centers.

> RIFT develops
> anisotropic routing concepts (arguably from RPL) and couples DV and
> LS to get the best of both worlds.

It usually results in the worst of both, I'm afraid.


But none of the above allow an source router to decide once and for
all what it will get.


As there are not so many alternative routes with Clos and
fat-tree network topologies of data centers, pure source
routing combined with some transport protocol to
simultaneously try multiple routes should be the best
solution, IMO, because avoiding link saturation is an
important goal.


When you drive and the street is blocked, you can U-turn around the
block and rapidly restore the shortest path. The protocols above will
not do that; this is why technologies such as LFA were needed on top.
But then the redundancy is an add-on as opposed to a native feature
of the protocol.


What if the network is not very large and the minimum hello interval
of OSPF is 1ms?


Thinking outside that box would then mean: - To your end-to-end
principle point, let the source decide the packet treatment
(including path) based on packet needs


To apply the E2E argument to LS routing, all the routers
are *dumb* intermediate systems to quickly flood LS. At
the same time, all the routers are ends that initiate
flooding of local LS, receive flooded LS and
compute the best route to destinations in a way
consistent with other routers, because they share the same
flooded LS except during short transition periods.

    Masataka Ohta


Re: V4 via V6 and IGP routing protocols

2022-04-03 Thread Masataka Ohta

Dave Taht wrote:


Periodically I still do some work on routing protocols. 12? years ago I had kind
of given up on ospf and isis, and picked the babel protocol as an IGP
for meshy networks because I felt link-state had gone as far as it
could and somehow unifying BGP DV with an IGP that was also DV
(distance vector) seemed like a path forward.


As DV depends on other routers to choose the best path from
several candidates updated asynchronously, which means
it is against the E2E principle and decisions by other
routers are delayed a lot waiting until all the candidates
are updated, it is hopeless.

OTOH, as LS only lets routers distribute the most current link
states instantaneously and lets the end systems of individual
routers compute the best path, LS converges quickly.

BGP is DV because there is no way to describe policies of
various domains and, even if it were possible, most, if
not all, domains do not want to publish their policies
in full detail.


My question for this list is basically, has anyone noticed or fiddled
with babel?


No.

Masataka Ohta


Re: V6 still not supported

2022-04-03 Thread Masataka Ohta

Matthew Petach wrote:


Hi Masataka,


Hi,


One quick question.  If every host is granted a range of public port
numbers on the static stateful NAT device, what happens when
two customers need access to the same port number?


I mean static outgoing port numbers, but your concern
should be well-known incoming port numbers, which is
an issue not specific to "static stateful" NAT.

Because there's no way in a DNS NS entry to specify a
port number, if I need to run a DNS server behind this
static NAT, I *have* to be given port 53 in my range;
there's no other way to make DNS work.


And SMTP, as is explained in draft-ohta-e2e-nat-00:

   A server port number different from well known ones may be specified
   through mechanisms to specify an address of the server, which is the
   case of URLs. However, port numbers for DNS and SMTP are, in general,
   implicitly assumed by DNS and are not changeable.


   Or, a NAT gateway may receive packets to certain ports and behave as
   an application gateway to end hosts, if request messages to the
   server contains information, such as domain names, which is the case
   with DNS, SMTP and HTTP, to demultiplex the request messages to end
   hosts.  However, for an ISP operating the NAT gateway, it may be
   easier to operate independent servers at default port for DNS, SMTP,
   HTTP and other applications for their customers than operating
   application relays.

Though the draft is for E2ENAT, the situation is the same
for any kind of NAT.


This means
that if I have two customers that each need to run a
DNS server, I have to put them on separate static
NAT boxes--because they can't both get access to
port 53.


See above for other possibilities.


This limits the effectiveness of a stateful static NAT
box


For incoming ports, static stateful NAT is no worse than
dynamic NAT. Both may be configured to map certain
incoming ports to certain local ports and addresses,
statically or dynamically with, say, UPnP.

The point of static stateful NAT is, for outgoing ports,
that it does not require logging.

tl;dr -- "if only we'd thought of putting a port number field
in the NS records in DNS back in 1983..."


And, MX.

As named has a "-p" option, I think some people were already
aware of the uselessness of the option in 1983. But putting in
a port number field at that time would have been overkill.

    Masataka Ohta


Re: V6 still not supported

2022-04-01 Thread Masataka Ohta

Pascal Thubert (pthubert) via NANOG wrote:


- Stateful NATs the size of the Internet not doable,


Stateful NATs are necessary only near leaf edges of ISPs,
for hundreds of customers or maybe a little more
than that, and are doable.

If you make the stateful NATs static, that is, each
private address has a statically configured range of
public port numbers, it is extremely easy, because no
logging is necessary for a police-grade audit trail.
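
A minimal Python sketch of such a static mapping (illustrative only;
the slice size and base port are assumptions): each private host
owns a fixed slice of public ports, so "who used public port P" can
be answered with no per-connection log.

    PORTS_PER_HOST = 256              # assumed slice size
    BASE_PORT = 1024                  # skip well-known ports
    # (65536 - 1024) / 256 = 252 private hosts per public address

    def port_range(host_index):
        # Public ports statically assigned to the host_index-th host.
        low = BASE_PORT + host_index * PORTS_PER_HOST
        return low, low + PORTS_PER_HOST - 1

    def owner_of(public_port):
        # Audit: recover the private host from the public port alone.
        return (public_port - BASE_PORT) // PORTS_PER_HOST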

Masataka Ohta

