svn commit: r154725 [2/2] - in incubator/directory/asn1/branches/rewrite: ber/xdocs/ codec/xdocs/ stub-compiler/xdocs/

akarasulu 21 Feb 2005 21:44:53 -0000

Added: incubator/directory/asn1/branches/rewrite/ber/xdocs/eureka.xml
URL: http://svn.apache.org/viewcvs/incubator/directory/asn1/branches/rewrite/ber/xdocs/eureka.xml?view=auto&rev=154725
==============================================================================
--- incubator/directory/asn1/branches/rewrite/ber/xdocs/eureka.xml (added)
+++ incubator/directory/asn1/branches/rewrite/ber/xdocs/eureka.xml Mon Feb 21 13:44:46 2005
@@ -0,0 +1,929 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<document>
+  <properties>
+    <author email="[EMAIL PROTECTED]">Alex Karasulu</author>
+    <title>Eureka!</title>
+  </properties> 
+  
+ <body>
+
+  <section name="And we saw the light ... ">
+     <p>
+       One day Wes and Alex started talking about going to town on a new 
+       ASN.1 BER Library and here's what happened ...
+     </p>    
+     <subsection name="The Conversation">
+      <source>
+[SNIP]
+
+Wes says:
+I've been thinking about the decoding process a bit over the weekend.
+
+Alex Karasulu says:
+k I'm listening
+
+Wes says:
+and encoding.
+
+Wes says:
+I'm not sure at the initial stage there will be *one* decoder.
+
+Wes says:
+We will need some place to hold our TLV tree.
+
+Wes says:
+and also, I was thinking about really long messages.
+
+Alex Karasulu says:
+you need multiple codecs (coder decoders)
+
+Alex Karasulu says:
+right
+
+[SNIP]
+
+Wes says:
+We got one part that builds the tree
+
+Wes says:
+part two should be the translation.
+
+[SNIP]
+
+Wes says:
+I think the only issue we have is how to handle chunking, and blocking versus 
+non-blocking code.
+
+Wes says:
+And also, dealing with really huge messages.
+
+Wes says:
+It obviously won't make sense to build a TLV tree in its entirety for a huge 
+search result.
+
+Alex Karasulu says:
+right I agree
+
+Alex Karasulu says:
+for encoding there is a mechanism for breaking down large TLVs of simple types
+
+Wes says:
+encoding is a non issue as far as chunking goes.
+
+Alex Karasulu says:
+basically in the book they talk about 3 ways of specifying length
+
+Alex Karasulu says:
+the L part
+
+Alex Karasulu says:
+right but it affects decoding
+
+Alex Karasulu says:
+but if another provider is doing encoding I see what you mean
+
+Alex Karasulu says:
+basically we can break stuff down by injecting the 3rd indeterminate length
+form
+
+Alex Karasulu says:
+follow me out
+
+Wes says:
+You give the encoder an output interface, and every time it fills up the byte 
+buffer, it spits it out.
+
+Alex Karasulu says:
+Strictly talking about decoding and chunk sizes for now.
+
+Wes says:
+K.  decoding then.
+
+Alex Karasulu says:
+just for background - you read the section on the 3 different modes for 
+specifying length right: short, long and indeterminate?
+
+[SNIP]
+
+Alex Karasulu says:
+You're reading and encounter a really big simple type using the long encoding
+for L.  So you know what you have to read is a huge blob of data in one big
+hunk.  Basically there is some threshold u use to judge whether or not the
+blob is too big and needs to be chopped up.
+
+Wes says:
+I did read the section on the length.
+
+Alex Karasulu says:
+cool
+
+Wes says:
+I actually printed the whole appendix out and read it.
+
+Wes says:
+on BER.
+
+Alex Karasulu says:
+cool that's what I was referring to
+
+Wes says:
+An encoder can choose any one he wants.
+
+Alex Karasulu says:
+Now your decoder can break down the long format into the indeterminate format 
+nesting smaller TLVs inside the TLV.  Hence converting the simple TLV into a 
+constructed one.
+
+Alex Karasulu says:
+The key here is not to keep all the tlvs in memory or the entire encoded
+buffer in memory
+
+Wes says:
+For decoding, there are messages where keeping the intermediate form in memory 
+is not an issue, and with others, there are.
+
+Wes says:
+issues.
+
+Alex Karasulu says:
+Right depends on the message size
+
+Wes says:
+The client will want to process most of the messages as a complete object.
+
+Wes says:
+By definition, it will be in memory.
+
+Alex Karasulu says:
+Yeah I know what you are saying.  We need to make the library not do this 
+though.  Then there would be more than one copy in memory.  Leave it up to the 
+user to determine how the data is dealt with.  Eventually we can take measures 
+to stream data if we want instead of having it all in memory.
+
+Wes says:
+Back up just a second.
+
+Alex Karasulu says:
+There are funky tactics we can employ way down the road - but for the time 
+being lets make it so our codecs dont need massive footprints 
+
+Alex Karasulu says:
+sure talk to me
+
+Wes says:
+I used this technique in a Btrieve interface I wrote for U. S. South...
+
+Wes says:
+which I stole from OpenTDS.
+
+Alex Karasulu says:
+Btrieve?
+
+Wes says:
+Yea, an ISAM database.
+
+Alex Karasulu says:
+oh ok
+
+Wes says:
+It used byte buffers to send and retrieve records.
+
+Wes says:
+I wrote a java class that basically treated the byte array as primitives.
+
+Alex Karasulu says:
+cool so you're already of the mindset to keeping the decoding and encoding 
+memory footprints small
+
+Wes says:
+That might not work with us though.
+
+Wes says:
+It might.
+
+Wes says:
+All we need to know
+
+Wes says:
+is that this field goes with this TLV.
+
+Wes says:
+and convert it on the fly.
+
+Wes says:
+Also, we can simply dump the TLVs when we are done.
+
+Alex Karasulu says:
+yeah that's part of some tables we may need to maintain with a mapping
+
+Alex Karasulu says:
+right I think we're on the same page
+
+Alex Karasulu says:
+I have a small idea though
+
+Alex Karasulu says:
+Basically wrt the codec's interfaces
+
+Alex Karasulu says:
+To me you give an array of bytes in a byte[] or a ByteBuffer (this is the 
+delivered partial chunk) and you get back a set of TLVs for that chunk.
+
+Alex Karasulu says:
+or take it in the opposite direction for a encoder
+
+Alex Karasulu says:
+this is your stage 1 (BER bytes ->TLVs)
+
+Alex Karasulu says:
+now we need to find a way to represent TLVs in a linear fashion and still 
+maintain the tree structure.  However we don't want direct back references 
+to where the list of TLVs plug into the entire tree because this would mean 
+we have to have the whole tree in memory.
+
+Alex Karasulu says:
+does that make sense I know its a lil nebulous
+
+Wes says:
+Keep it simple  
+
+Alex Karasulu says:
+ok in decoding bytes go in and TLVs come out
+
+Wes says:
+Right.
+
+Alex Karasulu says:
+state is maintained between times u pump in bytes
+
+Alex Karasulu says:
+wit me?
+
+Wes says:
+Yup.
+
+Alex Karasulu says:
+now the TLVs coming out are a piece of the TLV tree
+
+Wes says:
+You got to be able to handle partial Ts, Ls, and Vs.
+
+Alex Karasulu says:
+right that's part of the state stuff
+
+Alex Karasulu says:
+if you're stuck in the middle of a simple tlv then you don't pump it out until 
+the chunks to complete it have arrived
+
+Alex Karasulu says:
+wit me?
+
+Wes says:
+right.
+
+Alex Karasulu says:
+So the key here is to have the right TLV representation or data structure.  We 
+have some requirements on this.
+
+Alex Karasulu says:
+the TLVs that come out of the decoder cannot directly, with java references, 
+refer to other TLVs  that came out before.  Because these references would 
+require the entire TLV tree in memory.
+
+Alex Karasulu says:
+This is one of those requirements you agree?
+
+Wes says:
+I don't see that being an issue.
+
+Wes says:
+The parent needs to know about the children, but not vice versa.
+
+Alex Karasulu says:
+right
+
+Wes says:
+and I don't see how you are going to be able to assemble an ASN.1 message in a 
+state driven fashion without making it very complicated.
+
+Alex Karasulu says:
+that's our primary issue here
+
+Wes says:
+and have two decoders hooked together as well.
+
+Alex Karasulu says:
+its a big problem to overcome
+
+Alex Karasulu says:
+and do it elegantly
+
+Alex Karasulu says:
+If we do this then our BER ASN.1 codec will be hot working in a non-blocking 
+fashion and being very efficient.  It's like the way SAX is used for reading 
+XML for our ASN.1 messages instead of using DOM.
+
+Alex Karasulu says:
+the ideas are similar
+
+Alex Karasulu says:
+you didn't think this was gonna be a cake walk did ya  
+
+Wes says:
+Hmmmm.
+
+Alex Karasulu says:
+you do understand where I was coming from wit the sax and dom stuff right?
+
+Wes says:
+yea.
+
+Wes says:
+That I understand.
+
+Alex Karasulu says:
+do you think its possible?
+
+Wes says:
+So you have an event driven ASN.1 parser.
+
+Wes says:
+I think that's still easy.
+
+Wes says:
+However, assembling them into the messages is still complicated.
+
+Wes says:
+every ASN.1 message type would have to be derived from our parser.
+
+Wes says:
+Then a factory could create the message type based on the application type.
+
+Alex Karasulu says:
+hmmm
+
+Alex Karasulu says:
+what do you mean by: "every ASN.1 message type would have to be derived from 
+our parser."
+
+Wes says:
+You want the ASN.1 messages to be able to assemble themselves? or no.
+
+Alex Karasulu says:
+Now you're talking about using the ASN.1 specification like a DTD to drive the 
+decoding
+
+Alex Karasulu says:
+?
+
+Alex Karasulu says:
+Yep I see yes
+
+Alex Karasulu says:
+u use the ASN.1 spec or a set of classes generated by an ASN.1 spec compiler
+
+Alex Karasulu says:
+question is do we need a compiler now?
+
+Wes says:
+Right.
+
+Wes says:
+Factory returns the ASN.1 message on the application tag.
+
+Alex Karasulu says:
+right I see where you're going with the design
+
+Wes says:
+the parser then passes everything to the ASN.reader interface,
+
+Wes says:
+SAX like.
+
+Alex Karasulu says:
+Hmm sounds like it should be very possible
+
+Wes says:
+of the application object.
+
+Wes says:
+who knows how to assemble himself.
+
+Alex Karasulu says:
+right
+
+Alex Karasulu says:
+This is huge
+
+Alex Karasulu says:
+I wonder if other ASN.1 tools have this sax like mechanism already in place.
+
+Wes says:
+But how do we handle ASN.1 messages which need to be streamed.
+
+Wes says:
+like a huge search result.
+
+Alex Karasulu says:
+that's not so much the issue 
+
+Alex Karasulu says:
+a large result set takes n+2 messages
+
+Alex Karasulu says:
+sorry n+1
+
+Wes says:
+You have a search result right.
+
+Wes says:
+Tag = Applicationz
+Length = 00
+Value = Search Results
+
+Wes says:
+Now V is made up of thousands of result messages.
+
+Alex Karasulu says:
+In the LDAP protocol a search result is returned as n+1 messages.
+
+Alex Karasulu says:
+each result is a SearchEntryResponse for the 'n' and one SearchDoneResponse 
+PDU to end the resultset
+
+Alex Karasulu says:
+n+1 messages
+
+Wes says:
+Ah.
+
+Wes says:
+But are they wrapped in an application TLV?
+
+Alex Karasulu says:
+but think of a large blob of data
+
+Wes says:
+or is it just one stream of TLVs.
+
+Alex Karasulu says:
+like say some binary chunk
+
+Alex Karasulu says:
+the application TLV for each response type is in the LDAP message envelope.  
+There is a top level LDAP message type which is a TLV then the different 
+response types have you know some enumeration values to determine which 
+response type the top level envelope or application TLV represents
+
+Wes says:
+Right.
+
+Alex Karasulu says:
+but your question is valid for say a single SearchEntryResponse where one of 
+the attributes is a huge binary chunk
+
+Wes says:
+So the event firing for the top level envelope will be different than the TLVs 
+which are part of the envelope.
+
+Alex Karasulu says:
+the top level LDAPMessage envelope defined for the LDAP asn.1 will be a 
+constructed TLV
+
+Alex Karasulu says:
+event might fire for it
+
+Alex Karasulu says:
+same one every time
+
+Wes says:
+Right, but not after the entire TLV is read into memory.
+
+Wes says:
+that would defeat our SAX based parser.
+
+Alex Karasulu says:
+but its constitution will change depending on the type of message it is
+
+Alex Karasulu says:
+right
+
+Alex Karasulu says:
+exactly
+
+Wes says:
+I'm with you.
+
+Alex Karasulu says:
+you would get a start_ldap_message event
+
+Wes says:
+Actually,
+
+Wes says:
+for the envelope, you would need to hit the factory.
+
+Wes says:
+to get the appropriate LDAP message.
+
+Alex Karasulu says:
+then perhaps the message_type_event will fire to note the contained TLV that 
+specifies the LDAP application's message type.
+
+Alex Karasulu says:
+et cetera. see where i'm going with it - you don't need the entire message to 
+fire its arrival.  Like sax where you say start tag for this element then the 
+contained elements then close tags etc.
+
+Wes says:
+Got ya.
+
+Wes says:
+I think that's pretty cool.
+
+Alex Karasulu says:
+I think we're getting somewhere cool here I'm very excited.  I need to take 
+another look at a sax implementation again out there.  It will give me some 
+insight into some possible general architecture for us.
+
+Alex Karasulu says:
+Now going back to the massive chunk of binary.  So we have a 
+SearchEntryResponse with an entry of the result set containing an attribute 
+that is a huge binary chunk.  How do we stream it out right?  Then we can 
+talk about how we stream it in.
+
+Alex Karasulu says:
+Streaming it out is easy.  Let's for a moment presume that we can actually 
+stream out of the jdbm stuff.  You basically convert the long known length BER 
+encoding to the indeterminate encoding.  Then send out individual chunks of 
+this binary attribute in separate TLVs.  So you're turning big assed primitive 
+TLVs into constructed TLVs chunking out the content hence not needing the 
+entire V in memor
+
+Alex Karasulu says:
+y.
+
+Wes says:
+That's fine for us.  We have control over the encoding.
+
+Wes says:
+We won't be so lucky on the inbound side.
+
+Alex Karasulu says:
+Right
+
+Alex Karasulu says:
+Now let's think about that beast.
+
+Alex Karasulu says:
+We have a binary -> tlv encoder spitting out tlvs with each bit of input
+
+Alex Karasulu says:
+meant decoder above sorry
+
+Alex Karasulu says:
+now if the indeterminate length is used by the client when encoding and 
+sending to the server the server is ok the data is already chopped up and 
+its all good.  If not and the long length encoding is used then the data 
+comes into the server's decoder in chunks but the decoder sees a huge 
+long length. 
+
+Alex Karasulu says:
+Based on some threshold the decoder translates the incoming long length and 
+values for the simple type (primitive TLV) into a constructed TLV breaking 
+up the large known length TLV into the indeterminate form which can be spit 
+out with a few nested TLVs at a time (with each input chunk going into the 
+decoder).
+
+Alex Karasulu says:
+You follow? Decoder automatically breaks up large primitive long length
+encoded TLVs into the indeterminate form and spits those out in pieces rather
+than the one large primitive TLV.
+
+Wes says:
+What does that buy us?
+
+Alex Karasulu says:
+streaming
+
+Wes says:
+Is not the ASN message gonna re-assemble it anyways.
+
+Wes says:
+Do you still end up with 200K picture in the ASN.1 message.
+
+Alex Karasulu says:
+yeah that's application specific - remember we're talking just the BER->TLV 
+codec
+
+Alex Karasulu says:
+the other codec is Type to TLV
+
+Wes says:
+If we are using a SAX based parser, then the Type will be assembling itself as 
+the TLVs are decoded and fired.
+
+Alex Karasulu says:
+keeping it streaming means you don't have 2X the data or 400K in use just to 
+get the 200K picture
+
+Alex Karasulu says:
+right
+
+Wes says:
+At some point, you are going to have to put your faith in the garbage
+collector.
+
+Alex Karasulu says:
+right but that's not in the codec BER to TLV code
+
+Alex Karasulu says:
+keep that lean and mean - why you ask
+
+Wes says:
+Also, if you want a truly small memory footprint, then you could put stuff
+like that in a small embedded database.
+
+Alex Karasulu says:
+well the TLV to Type code can be made lean and mean too
+
+Wes says:
+I just don't think at this stage that we need to be all that worried about
+huge blocks of binary data.
+
+Alex Karasulu says:
+right we use referrals to data on disk to manage large pieces of data that 
+need to be streamed but this we can do later.
+
+Wes says:
+Exactly.
+
+Alex Karasulu says:
+yes but we want the options to be open - right now we can just design the 
+interfaces so all this can be added later.
+
+Alex Karasulu says:
+Interfaces and contracts should be designed to allow these very low memory 
+footprints.  Thinking through the process and what it takes to get there 
+makes us understand better what the design and interfaces should look like.
+
+Alex Karasulu says:
+I don't care if the first implementation is a hog
+
+Wes says:
+The BER stuff today doesn't deal with this.
+
+Wes says:
+It doesn't care.
+
+Alex Karasulu says:
+for large pieces of data
+
+Wes says:
+It's an application issue.
+
+Alex Karasulu says:
+right
+
+Alex Karasulu says:
+what the app does with it is up to the app but let's keep the ber codecs low in 
+memory image regardless of the fact that some app will be a pig and stream the 
+data into memory anyway.  This is all that I'm trying to say.
+
+Alex Karasulu says:
+wit me?
+
+Wes says:
+K.
+
+Alex Karasulu says:
+cool we're tight on this but I think it will take more research on both our 
+parts - anyway apache is back up again after a power failure.  Here's the new 
+stuff I created for ya:   
+http://cvs.apache.org/viewcvs.cgi/incubator/directory/snickers/?root=Apache-SVN
+
+Alex Karasulu says:
+that's the top level of the snickers (snacc replacement) subproject
+
+Alex Karasulu says:
+that's all you and Jeff with the C based version of this thang
+
+Wes says:
+Right.
+
+Wes says:
+You won't find much other ASN.1 stuff out there.
+
+Wes says:
+I'm comfortable that no one is doing it this way, either.
+
+Wes says:
+It will make it uniquely Apache.
+
+Alex Karasulu says:
+Ok. Let's touch base in a day or two to regroup
+
+Wes says:
+Do you think ASN.1 is going to die?
+
+Alex Karasulu says:
+this is all good stuff and I'll try to get it out there.
+
+Alex Karasulu says:
+no way
+
+Alex Karasulu says:
+ASN.1 is awesome stuff
+
+Wes says:
+We'll see.
+
+Alex Karasulu says:
+SNMP is based on it and so is Kerberos
+
+Alex Karasulu says:
+what's the alternative?
+
+Wes says:
+XML is what everyone is using now.
+
+Alex Karasulu says:
+well there is XER for ASN.1 
+
+Alex Karasulu says:
+XML Encoding Rules
+
+Alex Karasulu says:
+ASN.1 can go to BER, PER, XER, and DER
+
+Wes says:
+Yes.
+
+Alex Karasulu says:
+the encoding does not affect the ASN.1 specification and that is what makes 
+ASN.1 a winner always.
+
+Wes says:
+Slapping XML on ASN.1 ain't the same.
+
+Alex Karasulu says:
+the XML format is just for the encoding of the data types 
+
+Wes says:
+I agree that ASN.1 is a good protocol.
+
+Alex Karasulu says:
+protocol specification syntax
+
+Alex Karasulu says:
+it kicks ass I think and is here to stay.
+
+Wes says:
+If we do this, we are going to go backwards right?
+
+Wes says:
+Do the compiler last.
+
+Alex Karasulu says:
+go backwards?
+
+Alex Karasulu says:
+yeah that might be the case or we can work it together.
+
+Wes says:
+You need to let me work this.
+
+Alex Karasulu says:
+I can do the compiler with you and you can handle the runtime
+
+Wes says:
+You got other things to do.
+
+Alex Karasulu says:
+ok its all you then 
+
+Alex Karasulu says:
+I'm just a follower
+
+Wes says:
+I won't mind help with the compiler.
+
+Wes says:
+Just don't get going on it any time soon  
+
+Alex Karasulu says:
+sure I have extensive javacc and antlr experience
+
+Wes says:
+Deal.
+
+Alex Karasulu says:
+hehe no worries with that my plate as you know is overflowing.
+
+Alex Karasulu says:
+my bladder too
+
+Alex Karasulu says:
+I'll catch ya later I need to hit the head
+
+Wes says:
+Talk about the decoder's stream.
+
+Wes says:
+K
+
+Alex Karasulu says:
+ttyl
+
+Wes says:
+Talk later then.
+
+Alex Karasulu says:
+ok gimme 45 seconds
+
+Alex Karasulu says:
+I'm back
+
+Alex Karasulu says:
+what about the decoder's stream.
+
+Wes says:
+So, how do we feed the decoder then.
+
+Alex Karasulu says:
+Its all about how we design our interfaces.  You know I've been looking at 
+commons-codec and see some potential but changes will be needed.
+
+Alex Karasulu says:
+Follow me for a sec.
+
+Alex Karasulu says:
+Now the codec interfaces are designed to convert stuff in one shot.  
+bytes in bytes out sort of thang.  Very blocking dependent stuff and not very 
+cool for us with a SEDA and NIO based server.  
+
+Alex Karasulu says:
+wit me?
+
+Wes says:
+right.
+
+Alex Karasulu says:
+As you might have guessed this is not good for servers that need to keep 
+memory footprints low while servicing possibly several hundred requests 
+per second.
+
+Alex Karasulu says:
+So what do we do? We design new non-blocking and NIO based interfaces for the 
+codec API and submit them.  
+
+Alex Karasulu says:
+its down again damn
+
+Wes says:
+I got my update  
+
+Alex Karasulu says:
+cool
+
+Wes says:
+Must have brought it down.
+
+Alex Karasulu says:
+yeah maybe it will be up soon
+
+Alex Karasulu says:
+anyway
+
+Alex Karasulu says:
+We redesign these codec interfaces to manage an encoding session and a decoding
+session so chunks can be processed in a stateful manner to be conducive to 
+non-blocking use.
+
+Alex Karasulu says:
+Or we use events like you said
+
+Alex Karasulu says:
+Basically we contribute this to the commons stuff and make sure the community 
+understands why and what we're doing.  That way they can double check us.
+
+Alex Karasulu says:
+Then we use those interfaces to implement the ASN.1 stuff.
+
+Wes says:
+Right.
+
+Alex Karasulu says:
+We do this in the snickers area but put back as much into the commons codec as 
+we can.  You game with this strategy?
+
+Wes says:
+Yea, that's fine.
+
+Wes says:
+I'll check out commons code as soon as it comes up.      
+
+</source>
+     </subsection>
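+     <p>
+       As an illustration of the three length forms discussed above (short,
+       long, and the one X.690 calls indefinite, referred to in the
+       conversation as indeterminate), here is a minimal Java sketch of
+       reading a BER length field.  The names <code>BerLengthSketch</code>
+       and <code>readLength</code> are hypothetical and not part of this
+       library.  The sketch assumes every length octet is already in the
+       buffer, which a chunking decoder could not take for granted.
+     </p>
+     <source><![CDATA[
```java
import java.nio.ByteBuffer;

/** Hypothetical helper, not part of the BER runtime: reads one BER length field. */
final class BerLengthSketch {
    /** Marker returned for the indefinite form (a single 0x80 octet). */
    static final long INDEFINITE = -1L;

    /**
     * Reads a BER length starting at the buffer's current position.
     * Assumption: all length octets are already present in the buffer.
     */
    static long readLength(ByteBuffer buf) {
        int first = buf.get() & 0xFF;
        if ((first & 0x80) == 0) {
            return first;              // short form: 0..127 in a single octet
        }
        int count = first & 0x7F;      // low 7 bits: how many length octets follow
        if (count == 0) {
            return INDEFINITE;         // indefinite form: contents end with 0x00 0x00
        }
        if (count > 7) {
            // Keeps the result inside a positive long; a limit of this sketch only.
            throw new IllegalArgumentException("length field too large for this sketch");
        }
        long length = 0;
        for (int i = 0; i < count; i++) {
            length = (length << 8) | (buf.get() & 0xFF);   // long form: big-endian octets
        }
        return length;
    }

    public static void main(String[] args) {
        System.out.println(readLength(ByteBuffer.wrap(new byte[] { 0x05 })));
        System.out.println(readLength(ByteBuffer.wrap(new byte[] { (byte) 0x82, 0x01, 0x00 })));
        System.out.println(readLength(ByteBuffer.wrap(new byte[] { (byte) 0x80 })));
    }
}
```
+]]></source>
+     <p>
+       Run as is, the sketch prints 5, 256 and -1, one value for each of the
+       three forms.
+     </p>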
+   </section>
+  </body>
+</document>


Added: incubator/directory/asn1/branches/rewrite/ber/xdocs/index.xml
URL: http://svn.apache.org/viewcvs/incubator/directory/asn1/branches/rewrite/ber/xdocs/index.xml?view=auto&rev=154725
==============================================================================
--- incubator/directory/asn1/branches/rewrite/ber/xdocs/index.xml (added)
+++ incubator/directory/asn1/branches/rewrite/ber/xdocs/index.xml Mon Feb 21 13:44:46 2005
@@ -0,0 +1,67 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<document>
+  <properties>
+    <author email="[EMAIL PROTECTED]">Alex Karasulu</author>
+    <title>BER Runtime</title>
+  </properties>
+  <body>
+    <section name="Introduction">
+      <subsection name="What is it?">
+      <p>
+        The BER Runtime is an API for encoding and decoding ASN.1
+        data structures using Basic Encoding Rules (BER).  It implements
+        extensions to the <a href="http://jakarta.apache.org/commons/codec">
+        commons-codec</a> API, for building stateful chunking encoder/decoder
+        pairs that maintain state between processing calls.
+      </p>
+      </subsection>
+
+      <subsection name="Stateful Codecs">
+      <p>
+        More information on these new codec interfaces is available on the
+        <a href="../asn1-codec/index.html">stateful codec</a> home page.
+        You might want to read this before you continue since these extensions
+        are the basis for all ASN.1 encoders and decoders.
+      </p>
+      </subsection>
+
+      <subsection name="What is encoded/decoded?">
+      <p>
+        The BER runtime is protocol or ASN.1 module independent.  The unit of
+        substrate is a BER TLV (Tag, Length, Value) so any BER based protocol
+        can be decoded and encoded by the BER codec to and from TLV tuples.
+      </p>
+      </subsection>
+    </section>
+
+    <section name="BER Codec User Guides and Design Documents">
+      <table>
+        <tr>
+          <th>Subject</th>
+          <th>Description</th>
+        </tr>
+
+        <tr>
+          <td><a href="./asn1berinfo.html">ASN.1 and BER Information</a></td>
+          <td>Links to various books and specifications on ASN.1 and BER</td>
+        </tr>
+
+        <tr>
+          <td><a href="./BERDecoderDesign.html">BER Decoder Design</a></td>
+          <td>Explains how and why the BERDecoder was designed</td>
+        </tr>
+
+        <tr>
+          <td><a href="./BERDigesterDesign.html">BER Digester Design</a></td>
+          <td>Explains how and why the BERDigester was designed</td>
+        </tr>
+
+        <tr>
+          <td><a href="./BEREncoderDesign.html">BER Encoder Design</a></td>
+          <td>Explains how and why the BEREncoder was designed</td>
+        </tr>
+
+      </table>
+    </section>
+  </body>
+</document>

Added: incubator/directory/asn1/branches/rewrite/codec/xdocs/index.xml
URL: http://svn.apache.org/viewcvs/incubator/directory/asn1/branches/rewrite/codec/xdocs/index.xml?view=auto&rev=154725
==============================================================================
--- incubator/directory/asn1/branches/rewrite/codec/xdocs/index.xml (added)
+++ incubator/directory/asn1/branches/rewrite/codec/xdocs/index.xml Mon Feb 21 13:44:46 2005
@@ -0,0 +1,326 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<document>
+  <properties>
+    <author email="[EMAIL PROTECTED]">Alex Karasulu</author>
+    <title>Stateful Codecs</title>
+  </properties>
+  <body>
+    <section>
+    <subsection name="Introduction">
+      <p>
+        Codecs are bidirectional data transformations.  The data transformed,
+        often referred to as the substrate, may be [en]coded or decoded hence
+        the word codec.  The word codec also refers to the actual software
+        used to encode and decode data.  We use the term stateful codec for
+        lack of a better description for encoder/decoder pairs possessing
+        certain abilities and exhibiting the following behaviors:
+      </p>
+
+      <ul>
+        <li>the ability to interrupt and resume operation without losing
+            state</li>
+        <li>the ability to process a substrate in one or more steps operating
+            on small chunks rather than all of it in one large operation</li>
+        <li>the ability to free up resources while not actively processing,
+            perhaps until more of the substrate is available, or just to
+            multiplex limited resources</li>
+        <li>the use of a small fixed size chunk buffer rather than a variable
+            sized buffer equal to the entire size of the substrate, whatever
+            that may be</li>
+      </ul>
+    </subsection>
+
+    <subsection name="Advantages">
+      <p>
+        The abilities or behaviors listed above make stateful codecs ideal for
+        use in resource critical situations.  Servers based on codecs, for
+        example, may have to perform several thousand concurrent encode/decode
+        operations.  The resources required for such operations, namely threads
+        and memory buffers, will be limited.  Most of the time these operations
+        will be waiting for IO to complete, so they can free up resources to
+        allow other operations to proceed.  Stateful codecs make this possible
+        and complement servers designed using non-blocking IO constructs.
+      </p>
+
+      <p>
+        Servers cannot afford to allocate variable sized buffers for arriving
+        data.  Allowing variable sized buffers based on incoming data
+        sizes opens the door for DoS attacks where malicious clients can
+        cripple or crash servers, by pumping in massive or never ending
+        data streams.  Stateful codecs enable fixed size processing overheads
+        regardless of the size of the data unit transmitted to the server.
+        Smaller codec footprints lead to smaller server process memory
+        footprints.
+      </p>
+
+      <p>
+        These advantages also make stateful codecs ideal for use in resource
+        limited environments like embedded systems, PDAs or cellular phones
+        which use ASN.1 and one of its encoding schemes to control data
+        transmission.  These systems all run on limited resources where the
+        codec's operational footprint will have dramatic effects on the
+        performance of the device.
+      </p>
+    </subsection>
+
+    <subsection name="How is a stateful codec defined?">
+      <p>
+        There are several ways to skin this cat.  To this day discussions are
+        underway at the ASF to determine the best approach.  Until a consensus
+        is reached we have decided to use an event driven approach where the
+        events are modelled as callbacks.  To better explain the approach we
+        need to discuss it within the context of encoding/decoding.
+      </p>
+
+      <p>
+        Depending on the operation being performed, available chunks of the
+        substrate are processed using either the <code>encode()</code> or
+        the <code>decode()</code> method.  These methods are hence presumed
+        to process small chunks of the substrate.  The specific codec
+        implementation should know how to maintain state based on the encoding
+        between these calls to process a unit of substrate which likewise is
+        determined by the encoding.  So the encoding (a.k.a. codec) defines
+        what a unit of substrate is as well as any state information required
+        while piecemeal processing the substrate.  Several calls to these two
+        methods may be required to process a unit of the substrate.  When the
+        entire unit has been processed an event is fired.  Again the specific
+        codec detects the complete processing of a unit of substrate so it
+        knows when to fire this event.
+      </p>
+
+      <p>
+        Going back to our approach for defining a stateful codec, we modeled
+        the event as a callback to a specific interface.  For decoders this
+        would be a <code>DecoderCallback.decodeOccurred()</code> and for
+        encoders it would be an <code>EncoderCallback.encodeOccurred()</code>
+        method call.  These interface methods are called when an entire unit
+        of substrate is respectively decoded or encoded.
+      </p>
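+      <p>
+        The callback approach described above can be sketched with a toy
+        decoder.  The two interfaces below are stand-ins modeled on the names
+        used in this document, and their signatures are assumptions that may
+        differ from the actual API.  <code>FrameDecoderSketch</code> and its
+        framing (one length octet followed by that many payload octets) are
+        hypothetical and chosen only to show state being carried between
+        <code>decode()</code> calls.
+      </p>
+      <source><![CDATA[
```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Stand-in interfaces modeled on the names used in the text; the real
// signatures in the codec API may differ.
interface DecoderCallback {
    void decodeOccurred(StatefulDecoder decoder, Object decoded);
}

interface StatefulDecoder {
    void decode(ByteBuffer chunk);
    void setCallback(DecoderCallback cb);
}

/**
 * Hypothetical decoder whose unit of substrate is a frame: one length octet
 * (0..255) followed by that many payload octets.
 */
final class FrameDecoderSketch implements StatefulDecoder {
    private DecoderCallback callback;
    private byte[] frame;   // payload being assembled, null between frames
    private int filled;     // number of payload octets copied so far

    public void setCallback(DecoderCallback cb) {
        this.callback = cb;
    }

    public void decode(ByteBuffer chunk) {
        while (chunk.hasRemaining()) {
            if (frame == null) {                    // expecting a length octet
                frame = new byte[chunk.get() & 0xFF];
                filled = 0;
            }
            int n = Math.min(chunk.remaining(), frame.length - filled);
            chunk.get(frame, filled, n);            // copy what this chunk can supply
            filled += n;
            if (filled == frame.length) {           // unit complete: fire the event
                byte[] done = frame;
                frame = null;
                if (callback != null) {
                    callback.decodeOccurred(this, done);
                }
            }
        }
    }

    public static void main(String[] args) {
        FrameDecoderSketch decoder = new FrameDecoderSketch();
        decoder.setCallback(new DecoderCallback() {
            public void decodeOccurred(StatefulDecoder d, Object decoded) {
                System.out.println(new String((byte[]) decoded, StandardCharsets.US_ASCII));
            }
        });
        // Two frames ("hello" and "ok") arriving split across three chunks.
        decoder.decode(ByteBuffer.wrap(new byte[] { 5, 'h', 'e', 'l' }));
        decoder.decode(ByteBuffer.wrap(new byte[] { 'l', 'o', 2, 'o' }));
        decoder.decode(ByteBuffer.wrap(new byte[] { 'k' }));
    }
}
```
+]]></source>
+      <p>
+        The first call completes no frame, so no callback fires.  The second
+        call completes the first frame and starts the next, and the third
+        call completes it.
+      </p>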
+
+      <p>
+        This approach also allows for codec chaining in a pipeline where
+        codecs may be stacked on top of one another.  The callback interfaces
+        are used to bridge together codecs by feeding the output of one codec
+        operation into the input of another.  Specific classes have been
+        included in the API to accommodate this usage pattern.
+      </p>
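+      <p>
+        A rough sketch of such chaining follows.  It uses a hand-written
+        bridging callback rather than the API's own stacking classes, and
+        every name in it (<code>UnitCallback</code>,
+        <code>ChunkDecoder</code>, <code>LineDecoder</code>,
+        <code>PairDecoder</code>, <code>ChainSketch</code>) is hypothetical.
+        Stage one turns bytes into lines, and stage two turns pairs of lines
+        into a single unit.
+      </p>
+      <source><![CDATA[
```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Stand-in types for illustration only; the actual codec API may differ.
interface UnitCallback {
    void decodeOccurred(Object decoded);
}

abstract class ChunkDecoder {
    private UnitCallback callback;

    public void setCallback(UnitCallback cb) {
        callback = cb;
    }

    protected void fire(Object unit) {
        if (callback != null) {
            callback.decodeOccurred(unit);
        }
    }

    public abstract void decode(Object encoded);
}

/** Stage 1: bytes in, one String out per newline-terminated line. */
final class LineDecoder extends ChunkDecoder {
    private final StringBuilder line = new StringBuilder();   // partial line kept between calls

    public void decode(Object encoded) {
        ByteBuffer chunk = (ByteBuffer) encoded;
        while (chunk.hasRemaining()) {
            char c = (char) (chunk.get() & 0xFF);
            if (c == '\n') {
                String unit = line.toString();
                line.setLength(0);
                fire(unit);
            } else {
                line.append(c);
            }
        }
    }
}

/** Stage 2: lines in, one "key=value" String out per pair of lines. */
final class PairDecoder extends ChunkDecoder {
    private String key;   // first line of the pair, null when none is pending

    public void decode(Object encoded) {
        String s = (String) encoded;
        if (key == null) {
            key = s;
        } else {
            String unit = key + "=" + s;
            key = null;
            fire(unit);
        }
    }
}

final class ChainSketch {
    public static void main(String[] args) {
        LineDecoder lines = new LineDecoder();
        final PairDecoder pairs = new PairDecoder();
        // Bridge: each unit produced by stage 1 becomes input to stage 2.
        lines.setCallback(new UnitCallback() {
            public void decodeOccurred(Object decoded) {
                pairs.decode(decoded);
            }
        });
        pairs.setCallback(new UnitCallback() {
            public void decodeOccurred(Object decoded) {
                System.out.println(decoded);
            }
        });
        lines.decode(ByteBuffer.wrap("cn\nWe".getBytes(StandardCharsets.US_ASCII)));
        lines.decode(ByteBuffer.wrap("s\nsn\nSmith\n".getBytes(StandardCharsets.US_ASCII)));
    }
}
```
+]]></source>
+      <p>
+        Neither stage sees the whole input at once.  Each keeps only its own
+        partial unit between calls, which is what keeps the footprint of a
+        stacked pipeline small.
+      </p>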
+
+      <center>
+        <img src="../images/all-uml.gif"/>
+      </center>
+
+    </subsection>
+
+    <subsection name="StatefulDecoder Usage">
+      <p>
+        StatefulDecoders use callbacks to notify the successful decode of a
+        unit of encoded substrate.  Beyond this, the definition of what a
+        'unit of encoded substrate' is depends on the codec's decoder
+        implementation.  The definition may be size constrained or be a
+        function of context.
+      </p>
+      
+      <p>
+        Basically you give a decoder some of the substrate every so often
+        as more of the substrate is made available, then when a unit of 
+        encoded substrate has been decoded, the decoder notifies those 
+        concerned by invoking the callback.  
+      </p>
+      
+      <p>
+        A demonstration of how a StatefulDecoder works is illustrated below:
+      </p>
+      
+      <source>
+StatefulDecoder decoder = new SomeConcreteDecoder( 512 ) ;
+DecoderCallback cb = new DecoderCallback() {
+  // invoked once per completely decoded unit of substrate
+  public void decodeOccurred( StatefulDecoder decoder, Object decoded ) {
+      // do something with the decoded object
+  }
+};
+decoder.setCallback( cb ) ;
+      </source>
+      
+      <p>
+        The StatefulDecoder uses a callback to deliver decoded objects, each
+        of which is a decoded 'unit of encoded substrate'.  StatefulDecoders
+        are ideal for use in high performance servers based on non-blocking
+        IO.  Often StatefulDecoders will be used with a Selector in a loop to
+        detect input as it is made available.  As the substrate arrives, it
+        is fed to the decoder intermittently.  Finally the callback delivers
+        the decoded units of encoded substrate.  Below is a trivialized
+        example of how a StatefulDecoder can be used to decode the substrate
+        as it arrives fragmented by the TCP/IP stack:
+      </p>
+      
+      <source>
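+while ( true ) {
+  ...
+  SelectionKey key = ( SelectionKey ) list.next() ;
+  if ( key.isReadable() ) {
+    SocketChannel channel = ( SocketChannel ) key.channel() ;
+    buf.clear() ;            // make room for the next fragment
+    channel.read( buf ) ;    // may deliver only part of a unit
+    buf.flip() ;             // switch the buffer from filling to draining
+    decoder.decode( buf ) ;  // the callback fires once a unit is complete
+  }
+  ...
+}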
+      </source>
+      
+      <p>
+        As the code fragment shows, decode() has a void return type and so
+        returns nothing.  Because the callback delivers the finished product
+        when it is ready, the decode operation can occur asynchronously in
+        another thread or stage of a server if desired.
+      </p>
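The feed-then-callback behaviour described above can be sketched in a self-contained form. This is only an illustration under stated assumptions: UnitCallback and LineDecoder are hypothetical names, not part of the codec API, and the 'unit of encoded substrate' is assumed to be a newline terminated line.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a decoder callback; not the library's interface.
interface UnitCallback {
    void decodeOccurred(Object decoded);
}

// Assumption: one newline terminated line is one 'unit of encoded substrate'.
class LineDecoder {
    // State kept between decode() calls: the unit seen so far.
    private final StringBuilder partial = new StringBuilder();
    private UnitCallback callback;

    void setCallback(UnitCallback callback) {
        this.callback = callback;
    }

    // Accepts whatever substrate has arrived and returns nothing;
    // finished units are delivered through the callback instead.
    void decode(String fragment) {
        for (int i = 0; i < fragment.length(); i++) {
            char c = fragment.charAt(i);
            if (c == '\n') {
                if (callback != null) {
                    callback.decodeOccurred(partial.toString());
                }
                partial.setLength(0); // reset state for the next unit
            } else {
                partial.append(c);
            }
        }
    }
}

class LineDecoderDemo {
    static List<Object> run() {
        List<Object> delivered = new ArrayList<>();
        LineDecoder decoder = new LineDecoder();
        decoder.setCallback(delivered::add);
        decoder.decode("hel");      // partial unit: no callback yet
        decoder.decode("lo\nwor");  // completes one unit, starts the next
        decoder.decode("ld\n");     // completes the second unit
        return delivered;
    }
}
```

The caller never learns from decode() itself whether a unit was completed; only the callback carries that information, which is what lets the decode step move to another thread or stage.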
+    </subsection>
+    
+    <subsection name="Strengths and Weaknesses">
+      <p>
+        As the section above and the characteristics of StatefulDecoders
+        suggest, they are ideal for building network servers.  These decoders
+        waste very little memory per request, cannot be overloaded by massive
+        requests which may be used for DoS attacks, and process the substrate
+        as it arrives in chunks instead of in one prolonged CPU and memory
+        intensive step.
+      </p>
+      
+      <p>
+        Servers with a high degree of concurrency need to keep overheads low.
+        StatefulDecoders certainly help achieve that end by keeping the
+        active processing footprint low with a constant size regardless of the 
+        size of the substrate.
+      </p>
+      
+      <p>
+        The cost of creating a decoder for every new connection is usually
+        minimal, although we cannot foresee every possible implementation.
+        Whatever the cost of dedicating a StatefulDecoder to each new
+        connection, stateful protocol servers will often benefit more than
+        stateless servers.  The reasoning is as follows: the longer the life
+        of the connection, the more worthwhile it is to create a
+        StatefulDecoder, because its creation cost is amortized over the life
+        of the connection.
+      </p>
+      
+      <p>
+        The primary drawback is that StatefulDecoders are much more complex to
+        implement.  They are basically state driven automata which change
+        their state with the arrival of data.  Furthermore, it is very
+        difficult for StatefulDecoders to gracefully recover from corrupt or
+        lost input.
+      </p>
+    </subsection>
+    
+    <subsection name="StatefulDecoder Chaining/Stacking">
+      <p>
+        StatefulDecoders can easily be chained or stacked to operate on a 
+        substrate stream.  This is achieved by having the callback of one 
+        decoder feed the <code>decode(Object)</code> method of another.  Hence
+        the decoded byproduct of one decoder is the encoded substrate of 
+        another.
+      </p>
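A minimal sketch of chaining by hand, where the callback of one decoder feeds decode() of the next. All type names here (Sink, ChainableDecoder, SplittingDecoder, IntegerDecoder) are hypothetical and exist only for this illustration. The two stages are deliberately trivial and keep no state between calls, so the sketch shows the wiring rather than a realistic decoder.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal callback type; the real API may differ.
interface Sink {
    void decodeOccurred(Object decoded);
}

// Hypothetical base class: holds the callback and delivers decoded units.
abstract class ChainableDecoder {
    private Sink sink;

    void setCallback(Sink sink) {
        this.sink = sink;
    }

    // Subclasses call this once a unit has been decoded.
    protected void deliver(Object decoded) {
        if (sink != null) {
            sink.decodeOccurred(decoded);
        }
    }

    abstract void decode(Object encoded);
}

// Stage 1: splits a comma separated string, one callback per token.
class SplittingDecoder extends ChainableDecoder {
    void decode(Object encoded) {
        for (String token : ((String) encoded).split(",")) {
            deliver(token.trim());
        }
    }
}

// Stage 2: turns each token into an Integer.
class IntegerDecoder extends ChainableDecoder {
    void decode(Object encoded) {
        deliver(Integer.valueOf((String) encoded));
    }
}

class ChainDemo {
    static List<Object> run(String input) {
        SplittingDecoder first = new SplittingDecoder();
        IntegerDecoder second = new IntegerDecoder();
        List<Object> results = new ArrayList<>();
        // The decoded byproduct of the first decoder is the encoded
        // substrate of the second.
        first.setCallback(second::decode);
        second.setCallback(results::add);
        first.decode(input);
        return results;
    }
}
```

Neither stage knows about the other; the only coupling is the callback assignment, so stages can be reordered or replaced without touching their code.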
+      
+      <p>
+        Because chaining is likely to be common and several folks have
+        already expressed their interest in it, we have devised a special
+        StatefulDecoder implementation called a DecoderStack.  It is itself
+        a decoder, but other decoders can be pushed onto it.  When the stack
+        is empty it operates in pass-through mode: the decode operation is
+        basically the identity transformation.  When StatefulDecoders are
+        pushed, decode operations invoke a chain of decoders starting with
+        the bottom most in the stack going up to the top.  The final callback
+        invoked is the callback registered with the DecoderStack.
+      </p>
+      
+      <p>
+        Below is an example of how this DecoderStack is used.  The example is
+        taken from one of the JUnit test cases for DecoderStack:
+      </p>
+
+      <source>
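+public void testDecode() throws DecoderException {
+  DecoderStack stack = new DecoderStack() ;
+  CallbackHistory history = new CallbackHistory() ;
+  stack.setCallback( history ) ;
+  // 'decoder' is defined elsewhere in the test case; the assertion below
+  // implies it passes its input through unchanged
+  stack.push( decoder ) ;
+  stack.decode( new Integer(0) ) ;
+  assertEquals( new Integer(0), history.getMostRecent() ) ;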
+        
+  stack.push( new IncrementingDecoder() ) ;
+  stack.decode( new Integer(0) ) ;
+  assertEquals( new Integer(1), history.getMostRecent() ) ;
+
+  stack.push( new IncrementingDecoder() ) ;
+  stack.decode( new Integer(0) ) ;
+  assertEquals( new Integer(2), history.getMostRecent() ) ;
+}
+...
+
+class IncrementingDecoder extends AbstractStatefulDecoder
+{
+  public void decode( Object encoded ) throws DecoderException
+  {
+    Integer value = ( Integer ) encoded ;
+    value = new Integer( value.intValue() + 1 ) ;
+    super.decodeOccurred( value ) ;
+  }
+}
+      </source>      
+    </subsection>
+    
+    <subsection name="Recommendations to Implementors">
+      <p>
+        Keep it simple and rely on chaining to divide and conquer complex
+        decoders into several trivial decoders.  Besides simple chaining,
+        some situations will warrant the use of a choice driven decoder.
+        Such a decoder chooses which subordinate decoder to use based on its
+        current state.  For example in the simple BER byte stream to TLV
+        decoder in Snickers, there is a TagDecoder, a LengthDecoder and
+        several Value decoders that are swapped in and out when the top
+        BERDecoder switches state or detects a new primitive datatype.
+      </p>
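A rough sketch of the choice driven idea, not the Snickers implementation. SimpleTlvDecoder is a hypothetical name, and for brevity the subordinate tag, length and value decoders are collapsed into the branches of a switch rather than being separate objects. It is simplified on purpose: one-byte tags and short-form one-byte lengths only, and a list stands in for the callback.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical, heavily simplified TLV decoder driven by its current state.
// Real BER also allows multi-byte tags and long-form or indefinite lengths.
class SimpleTlvDecoder {
    private enum State { TAG, LENGTH, VALUE }

    private State state = State.TAG;
    private int tag;
    private byte[] value;
    private int filled;

    // Stands in for a callback: every finished TLV is recorded as "tag:value".
    final List<String> decoded = new ArrayList<>();

    void decode(ByteBuffer buf) {
        while (buf.hasRemaining()) {
            switch (state) {
                case TAG: {
                    tag = buf.get() & 0xff;
                    state = State.LENGTH;
                    break;
                }
                case LENGTH: {
                    int length = buf.get() & 0xff;
                    if (length > 127) {
                        throw new IllegalArgumentException(
                            "long-form length not supported in this sketch");
                    }
                    value = new byte[length];
                    filled = 0;
                    if (length == 0) {
                        emit();
                    } else {
                        state = State.VALUE;
                    }
                    break;
                }
                case VALUE: {
                    // Take only what this buffer holds; the rest arrives later.
                    int n = Math.min(buf.remaining(), value.length - filled);
                    buf.get(value, filled, n);
                    filled += n;
                    if (filled == value.length) {
                        emit();
                    }
                    break;
                }
            }
        }
    }

    // Deliver the finished TLV and go back to expecting a tag.
    private void emit() {
        decoded.add(tag + ":" + new String(value, StandardCharsets.US_ASCII));
        state = State.TAG;
    }
}
```

Because the state survives between calls, a TLV may be split across any number of buffers at any byte boundary and is still delivered exactly once.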
+      
+      <p>
+        When reading encoded data from buffers, keep in mind that there are
+        five possible configurations for the contents of an arriving buffer
+        with respect to the unit of encoded substrate:
+      </p>
+
+      <!--
+        todo add illustrations using images here - its not that hard
+        might want to turn this into a table instead of a ul if we decide
+        to do that
+      -->
+
+      <ul>
+        <li>
+          it contains a single complete discrete unit of encoded substrate
+        </li>
+        <li>
+          it contains many discrete and complete units of encoded substrate
+        </li>
+        <li>
+          it contains a partial fragment of a unit of encoded substrate
+        </li>
+        <li>
+          it contains two partial fragments of different units of encoded
+          substrate: the end of one unit and the start of the next
+        </li>
+        <li>
+          it contains one or more complete units of encoded substrate
+          together with one or more fragments
+        </li>
+      </ul>
+      
+      <p>
+        When fragments arrive they are either head or tail fragments.  Head
+        fragments start a unit of encoded substrate and are found at the end
+        of the buffer.  Tail fragments end a unit of encoded substrate and are
+        found at the front of the buffer.
+      </p>
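The five configurations and the head/tail terminology can be made concrete with a small classifier. This is a sketch under an assumed framing that is not taken from the library: each unit is a one-byte length followed by that many payload bytes. BufferShape and classify are hypothetical names.

```java
// Hypothetical helper describing what one arriving buffer contains.
// Assumed framing: a one-byte length followed by that many payload bytes.
final class BufferShape {
    final boolean tailFragment;  // buffer starts with the end of an earlier unit
    final int completeUnits;     // units that both start and end in this buffer
    final boolean headFragment;  // buffer ends with the start of an unfinished unit

    BufferShape(boolean tailFragment, int completeUnits, boolean headFragment) {
        this.tailFragment = tailFragment;
        this.completeUnits = completeUnits;
        this.headFragment = headFragment;
    }

    // pending: bytes still owed to a unit begun in an earlier buffer (0 if none).
    static BufferShape classify(byte[] buf, int pending) {
        if (pending > buf.length) {
            // The whole buffer is the middle of one unit: neither head nor tail.
            return new BufferShape(false, 0, false);
        }
        boolean tail = pending > 0;   // a tail fragment sits at the front
        int pos = pending;
        int units = 0;
        boolean head = false;
        while (pos < buf.length) {
            int end = pos + 1 + (buf[pos] & 0xff);
            if (end > buf.length) {
                head = true;          // a head fragment sits at the end
                break;
            }
            units++;
            pos = end;
        }
        return new BufferShape(tail, units, head);
    }
}
```

For instance, a buffer that begins with the last two bytes of a previous unit and ends with an unfinished unit is reported as a tail fragment, zero complete units and a head fragment, which corresponds to the fourth configuration in the list.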
+    </subsection>
+    </section>
+  </body>
+</document>

Added: incubator/directory/asn1/branches/rewrite/stub-compiler/xdocs/index.xml
URL: 
http://svn.apache.org/viewcvs/incubator/directory/asn1/branches/rewrite/stub-compiler/xdocs/index.xml?view=auto&rev=154725
==============================================================================
--- incubator/directory/asn1/branches/rewrite/stub-compiler/xdocs/index.xml 
(added)
+++ incubator/directory/asn1/branches/rewrite/stub-compiler/xdocs/index.xml Mon 
Feb 21 13:44:46 2005
@@ -0,0 +1,14 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<document>
+  <properties>
+    <author email="[EMAIL PROTECTED]">Alex Karasulu</author>
+    <title>Snickers ASN.1 Java Stub Compiler</title>
+  </properties>
+  <body>
+    <section name="Coming soon ...">
+      <p>
+        Wonderful things are coming soon ...
+      </p>
+    </section>
+  </body>
+</document>
