Tracy,

Thank you for responding.

The problem with "I have never run into any trouble that I am aware of" is
exactly that: undetected data integrity errors are, by definition, ones you
are not aware of until the moment it matters most. That risk needs to be
quantified, and I would love to see a quantification of what the risk level
really is. Many network protocols have their own data integrity checks built
in to protect the data being transferred. TCP has its rudimentary checksum
which, even though it is weak, still catches some errors today.
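
To illustrate why the TCP checksum counts as weak, here is a minimal Python
sketch of a 16-bit ones' complement checksum in the style of RFC 1071. The
function name and the sample bytes are made up for illustration. Because the
sum does not depend on word order, swapping two 16-bit words changes the data
but not the checksum, while a CRC-32 over the same bytes does change.

```python
import zlib


def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    # Fold the carries back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


original = b"\x12\x34\xab\xcd\x00\x01\xff\xfe"
# Swap the first two 16-bit words: the data changes, the sum does not.
swapped = original[2:4] + original[0:2] + original[4:]

same_checksum = internet_checksum(original) == internet_checksum(swapped)
same_crc32 = zlib.crc32(original) == zlib.crc32(swapped)
print("checksum unchanged:", same_checksum)
print("CRC-32 unchanged:", same_crc32)
```

This only shows one class of error the checksum cannot see; it says nothing
about how often such errors occur on a real network.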

My point about the network interconnect points is that they can introduce
errors as well. I found this paper:
http://portal.acm.org/citation.cfm?id=347561&dl=ACM&coll=DL&CFID=110304004&CFTOKEN=23505194
It is a good read: an analysis of network data captured over several months
in which the TCP checksum failed but the CRC passed. The authors categorized
many of the error types and found interesting problems with network
connection points (routers, switches, etc.) injecting errors into the packet
streams. The CRC passes because it is applied on each link but does not cover
the data while it is traversing a switch or router (fortunately, the error
was caught by the TCP checksum). I even saw an article about using the
digests on iSCSI for data integrity precisely because the data traverses
these network connection points. iSCSI digests are simply a 32-bit CRC
(CRC32C) applied to the header and/or the payload.
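
The per-link versus end-to-end distinction can be sketched in a few lines of
Python. This is a toy model, not a description of any real switch: it assumes
a device that damages a frame in its buffer and then generates a fresh CRC
for the outgoing link. The helper names are hypothetical, and zlib.crc32 (the
Ethernet-style CRC-32, not the CRC32C used by iSCSI) stands in for both the
link CRC and the end-to-end digest.

```python
import zlib


def link_send(payload: bytes):
    """Sender side of one link: attach a fresh CRC-32 (stand-in for the FCS)."""
    return payload, zlib.crc32(payload)


def link_receive(payload: bytes, fcs: int) -> bool:
    """Receiver side of one link: verify the CRC that came with the frame."""
    return zlib.crc32(payload) == fcs


def faulty_switch(payload: bytes) -> bytes:
    """Hypothetical switch that flips one bit while buffering the frame."""
    corrupted = bytearray(payload)
    corrupted[0] ^= 0x01
    return bytes(corrupted)


data = b"ATA sector data" * 4
end_to_end_digest = zlib.crc32(data)  # computed once, by the initiator

# Link 1: initiator -> switch. The CRC is valid on arrival.
frame, fcs = link_send(data)
link1_ok = link_receive(frame, fcs)

# Inside the switch the payload is damaged, then a new CRC is generated
# over the damaged bytes for the outgoing link.
frame, fcs = link_send(faulty_switch(frame))
link2_ok = link_receive(frame, fcs)

# Only the digest carried from the origin notices the change.
digest_ok = zlib.crc32(frame) == end_to_end_digest
print(link1_ok, link2_ok, digest_ok)
```

Both link checks pass while the end-to-end comparison fails, which is the
situation the paper describes for errors injected inside a connection point.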

So that has me thinking about AoE and the fact that it doesn't even have an
optional way to enable data integrity protection of the ATA data flow.
Is this because deployments of AoE are tightly constrained to a Coraid blade
rack connected directly to a large file server, with no switches or any other
network connections in between? Or is it because AoE is not deployed in
enterprise-level, mission-critical environments? The problem potentially
gets more acute when you introduce jumbo frames, because each 32-bit CRC
then has to cover far more data.

Don't get me wrong, I love the simplicity of the AoE protocol, but I would
love to understand what companies like Coraid say or present to potential
customers when they ask these kinds of questions. It would also be really
simple to add an optional data integrity check to AoE, but before doing so I
would still like to understand the characteristics of any potential problems.
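
As a rough idea of what such an optional check could look like, here is a
minimal Python sketch that appends a 4-byte digest to a payload and verifies
it on receipt. Nothing here is part of the AoE specification: the trailer
layout, the function names, and the choice of zlib.crc32 are all assumptions
made for illustration.

```python
import struct
import zlib

DIGEST_LEN = 4  # hypothetical trailer: one 32-bit CRC in network byte order


def add_digest(payload: bytes) -> bytes:
    """Append a CRC-32 of the payload as a 4-byte trailer."""
    return payload + struct.pack("!I", zlib.crc32(payload))


def check_digest(message: bytes) -> bytes:
    """Return the payload if the trailer matches, otherwise raise ValueError."""
    if len(message) < DIGEST_LEN:
        raise ValueError("message too short to carry a digest")
    payload, trailer = message[:-DIGEST_LEN], message[-DIGEST_LEN:]
    (expected,) = struct.unpack("!I", trailer)
    if zlib.crc32(payload) != expected:
        raise ValueError("digest mismatch")
    return payload


sector = bytes(range(256)) * 2       # stand-in for 512 bytes of ATA data
message = add_digest(sector)
assert check_digest(message) == sector
```

A real design would also have to decide how initiator and target negotiate
the option and what happens on a mismatch, which this sketch leaves out.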

David

On Wed, Nov 10, 2010 at 1:12 PM, Tracy Reed <tr...@ultraviolet.org> wrote:

> On Wed, Nov 10, 2010 at 11:09:16AM -0600, David Leach spake thusly:
> > I've been looking at AoE and I'm trying to understand what effect the
> > Ethernet CRC-32 data integrity checking has on the AoE communications?
>
> I too have been wanting to better understand the error correction
> facilities of AoE. So far I have never run into any trouble that I am
> aware of.
>
> > Has anyone done any analysis or have any response to AoE's ultimate
> > reliability for missed error detection of bad frames?
>
> I suspect it is sufficiently low as to not be an issue. You have to
> multiply the chance of an error by the chance that the error would not be
> caught due to being a 32 bit crc.
>
> > I think that there is a further problem to understand and that is with
> > network connection points. AoE is not routable but that doesn't mean you
> > can't use network switches to interconnect initiators with their AoE
> > targets and at these switches there seems to be a possible error point
> > introduced which AoE isn't protecting against? Are there best practices
> > for AoE installations to protect against these error points?
>
> AoE is not layer 3 routable. It is routable in general with spanning tree
> etc. You can also implement an ethernet tunnel (being careful of MTU
> concerns etc). I think the genius of AoE vs iSCSI is in adhering to the
> separation of concerns of each of the layers of the network stack.
>
> --
> Tracy Reed
> http://tracyreed.org
>
_______________________________________________
Aoetools-discuss mailing list
Aoetools-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/aoetools-discuss
