Oh! Wow, I'm happy that the video was useful to someone :)

Tiago


----- Original Message ----
From: Jayaram Subramanian <[email protected]>
To: [email protected]
Sent: Fri, 9 April, 2010 2:56:53
Subject: Re: DRDA - An abridged tutorial

Thanks Tiago. Also special thanks to your derby installation video,
thrpugh which i installed derby in my pc

Wiy

On Thu, Apr 8, 2010 at 2:49 PM, Tiago Espinha <[email protected]> wrote:
> Dear all,
>
> Earlier today Kathey gave me a crash course on DRDA for my GSoC project. 
> Jayaram asked Kathey for a transcript of our conversation and Kathey 
> suggested that I'd send it to the list, so that other contributors could spot 
> eventual mistakes. If you find any, please feel free to chip in with your 
> knowledge.
>
> Here goes:
> ---------------------8<----------------------
> <kmarsden> DRDA - Distributed Relational Database Architecture.
> <kmarsden> Basically it is a protocol that shuttles database requests from a 
> client over the network to a server.  The call the client an Application 
> Requester and the server an Application Server.
> <kmarsden> The Application Server term predates what we think of as an 
> Application Server and has nothing to do with it.
> <etiago> ok
> <kmarsden> So as we discussed, the derby client JDBC Driver (our Application 
> Requester) converts JDBC calls into DRDA, sends the DRDA accross the network 
> to Network Server (our Application Server) which converts them back into JDBC 
> which it sends to the embedded driver.
> <kmarsden> So the whole thing is a JDBC to DRDA to JDBC converter.  This way 
> we meet the requirement of having multiple jvms on multiple machines 
> accessing a single embedded database, because everything gets routed through 
> the network server process.
> <kmarsden> Make sense?
> <etiago> it does
> <kmarsden> DRDA is mostly associated historically with DB2 but there are 
> actually a lot of licensed DRDA vendors, including Microsoft
> <etiago> so let me try to establish a comparison here
> <etiago> DRDA is sort of a platform-agnostic way of transferring database 
> requests over the network, is this right?
> <etiago> something like XML
> <kmarsden> yes. That's right.   It can go over TCP/IP or something called 
> SNA, but yes it is platform-agnostic. As you may guess I think it started on 
> the mainframe with EBCDIC encoding.
> <kmarsden> Having a standard protocol has allowed even for a single client to 
> be shared amongst several databases.
> <kmarsden> For example the IBM DB2 Universal JDBC Driver (JCC) used to be 
> used with Derby and is still used with Informix as well as of course many 
> flavors of DB2.
> <kmarsden> DRDA iis also used with the DB2 C client. Not just JDBC
> <etiago> I see
> <kmarsden> In practice it is not as portable as you would like. It is 
> generally a lot of work to get one product's DRDA client working with a 
> different server.
> <etiago> ok, is DRDA just plaintext? I was looking at the trace and I saw 
> that there's an hexadecimal, ASCII and EBCDIC representations of the data
> <kmarsden> The encoding for data can be described in the protocol flow. Up 
> until now the DDM commands and parameters (which we will discuss in a bit) 
> were all in EBCDIC.
> <etiago> ok
> <kmarsden> UNICODEMGR allows them (except for the earliest) to be in UTF-8
> <kmarsden> So there are three volumes to the manuals. Volume 1 DRDA describes 
> the protocol flow.
> <kmarsden> It shows the commands and what order they flow in from AR to AS 
> and back.
> <kmarsden> But to read it you need to look at Volume 2, which describes the 
> commands and objects that are being flowed in detail
> <kmarsden> The DDM manual shows a 2 byte "Codepoint" for each one of these 
> commands or objects and that is how they are identified.
> <etiago> ah, kind of counterintuitive then #:)
> <kmarsden> You will see Codepoint.java class in both client and server that 
> lists the codepoints in Derby
> <etiago> I did look at that and all it reminded me of was pointers
> <kmarsden> Vol 3 describes the format for the data types.
> <etiago> so DRDA also has data types of its own?
> <kmarsden> I can't say I have ever actually used Vol 3 or understood it.  
> That work was done pretty early before even my time.  I spend my time mostly 
> in Vol 2 and Vol 1
> <etiago> oh alright
> <kmarsden> Yes, it does and I think at two levels.  For example DRDA itself 
> in Volume 1 will talk about Varchar and Char types but Vol 3 just talks about 
> strings in general.  But we'll move on because this is not where you will be 
> working.
> <etiago> ok
> * kasun ([email protected]) entrou em #derby
> * kasun agora chama-se Guest50421
> <kmarsden> Ok.  All this protocol flow and DDM objects are wrapped in 
> something called a Data Stream Structure (DSS) which is just really a packet 
> wrapper and has no real content except descxribing the length and chaniing 
> one DSS to another.  You will see these in the traces starting with a D0 and 
> identifes the packet as a request a reply or an object.
> <kmarsden> I only mention that really because you will see DSS referred to 
> all over the place. I don't think you will have to change that code either.
> <etiago> yes, I've seen DSS stuff around
> <kmarsden> Now lets look at the actual protocol.  When a connection is made 
> by the AR the first thing that is sent is an EXCSAT
> <kmarsden> Exchange Server Attributes Request
> <etiago> yes, I see that on the trace
> <kmarsden> This even with UNICODEMGR will be sent EBCDIC and is really 
> important because it is where the protocol levels and now encoding are 
> negotiated.
> <etiago> as I understand, we keep this as EBCDIC to allow clients that don't 
> support UTF-8 to ask for normal EBCDIC rather than UTF-8
> <kmarsden> Right. If you look at the DDM manual which is alphabetical at 
> EXCSAT, you will find out about it.,
> <kmarsden>
> <etiago> ok
> <kmarsden> Generally with EXCSAT you send an external name identifying the 
> client, a version and a manager level list.
> <kmarsden> It is the manager level list or MGRLVLLS where the protocol 
> negotiation and now encoding takes place.
> <kmarsden> oh no, I switched computers and don't have my spec to look at so I 
> wlll go from memory.
> <kmarsden> Most of the Manager levels like SQLAM for instance are just a 
> number 1- whatever to show the protocol level, but UNICODEMGR the one being 
> introduced for ACR7007 is a bit different.
> <kmarsden> For that one the client will send 0 if it wants to continue with 
> EBCDIC or 1208 if it wants UTF-8.  I am not sure exactly why it was set up 
> this way, maybe to allow different encodings in the future? I just don't know
> <etiago> ok yeah, that sort of makes sense
> <kmarsden> The next things that flow from the AR to the AS are ACCEC and 
> SECCHK to negotiate security. These are chained before receiving the EXCSATRM 
> so these are in EBCDIC too.
> <kmarsden> SECCHK (or maybe ACCSEC) has an optional RDBNAM parameter which 
> Derby client used to send.
> <etiago> that's the database name if I recall correctly
> <kmarsden> Some time ago I took that out in preparation of DERBY-728. We now 
> do not send the RDBNAM until it is required in ACCRDB
> <kmarsden> right.
> <etiago> oh by removing it from the SECCHK (which is EBCDIC-only) we can 
> actually support database names with chinese and japanese characters
> <kmarsden> Anyway after the AR flows EXCSAT, ACCSEC and SECCHK then the 
> server sends back an EXCSATRM with it's manager level list and then the AR 
> and AS settle on the level .
> <kmarsden> right.
> <etiago> ok what exactly is this level that they settle on?
> <kmarsden> For each level, for instance UNICODEMGR, if the AR sent 1208 and 
> the AS sent back 0, they would settle on 0 all EBCDIC.
> <kmarsden> I suppose it is actually the server that takes what the client has 
> provided and returns the actual level that it will be.
> <etiago> oh ok
> <kmarsden> If the server could handle 1208 but the client said 0, it would be 
> 0.
> <kmarsden> At least I think that's how it works. You better check
> <etiago> that makes sense but I will check
> <kmarsden> Also if the client knows nothing about this new manager level the 
> server will dumb things down,
> <kmarsden> For example if a 10.3 client tries to connect to a 10.6 server 
> after DERBY-728 has been implemented, EBCDIC it will be, UNICODEMGR level 0
> <kmarsden> make sense?
> <etiago> yep we have to maintain compatibility
> <etiago> by using what's common to everyone
> <kmarsden> right
> <kmarsden> So let's say you have implemented DERBY-728 and are using new 
> client and server.   AR sent 1208, AS responded 1208 for UNICODEMGR.   It is 
> with the next request ACCRDB that the encoding changes.
> <kmarsden> It is with this request that we now finally send the RDBNAM in 
> UTF-8 so we can send Chinese. Hooray!
> <etiago> yay :-)
> <kmarsden> But there is gotcha of course.  The DRDA character strings have 
> limits and not just character limits but byte limits.
> <etiago> meaning that the arguments of ACCRDB, etc have certain byte length 
> limitations, right?
> <kmarsden> right. It is really very sad because from a Derby perspective, 
> these are just arbitrary limits and since the database name is a often a full 
> file system path and now characters can take up to four bytes I am thinking 
> folks are going to run out.
> <etiago> yeah, I suppose these limits are defined somewhere in those DRDA 
> documents?
> <kmarsden> but I understand extending or eliminating 1) would require a new 
> opengroup ACXR which seems to take years and 2) will be a total no go in the 
> C world.
> <kmarsden> right.  I think RDBNAM is 255 bytes
> <etiago> actually, about that, is this ACR actually approved?
> <kmarsden> It is not.  It is undergoing the "fast track" approval process at 
> opengroup with some other ACR's.  Work on it within IBM completed over a year 
> ago, but they were waiting for a bunch of others to be ready to submit
> <etiago> gotta love bureaucracy :-)
> <kmarsden> But I think it is pretty solid.  The work you do should be for 
> 10.7, not 10.6
> <kmarsden> so we'll be safe just in case.
> <etiago> alright
> <kmarsden> I should mention that this ACR was not originally for the purpose 
> we are using it for.  It was originally meant  some sort of performance 
> improvement to avoid converting to EBCDIC all the time.
> <kmarsden> At least 6 times that I can remember someone suggested moving the 
> switch to UTF-8 later after ACCRDB, so I have scouts at opengroup that 
> promise they will send up a big red flag if somebody tries i again.
> <etiago> hahaha
> <kmarsden> anyway moving along.
> <etiago> that's good
> <kmarsden> So what I have covered is the part of the protocol that you will 
> be dealing with mostly. I will move on with some other protocol stuff but 
> mostly for general knowledge. Then some other time we can have a talk about 
> the Derby code in relation to the protocol
> <etiago> ok sounds good
> <kmarsden> So when Network server receives the ACCRDB it makes an embedded 
> connection to the database which it will use for all the requests on that 
> client connection.  The simplest thing that might come through is an 
> EXCSQLIMM which is just a simple statement execution with no result set and 
> not a prepared statement, maybe just an update or delete.
> <etiago> yep I got that in my trace: [derby]        SEND BUFFER: EXCSQLIMM    
>           (ASCII)           (EBCDIC)
> <kmarsden> Most statements need to be prepared. Even if they are not a JDBC 
> prepared statement, if they return a result set, they have to be prepared 
> with a PRPSQLSTT
> <kmarsden> The PRPSQLSTT command takes a package name, consistency token and 
> section number.  The package name and section number are used to identify the 
> statement later when it is executed.
> <kmarsden> In DB2 these are actual things and there are packages related to 
> holdability and other things I can't remember and the section is the 
> statement within that category.
> <etiago> hmm ok
> <kmarsden> In Derby, we just sort of pretend, because we don't actually have 
> packages.  But these are how the statements are identified and Network Server 
> keeps a hash table of the prepared statements keyed on package and section 
> number for retrieval.
> <kmarsden> An interesting way to look at this is to start network server and 
> then go into ij and prepare a statement or two  then  run runtimeinfo to see 
> the statements, their packages and section numbers.
> <etiago> ok
> <kmarsden> After PRPSQLSTT is the SQLSTT which is the actual SQL statement 
> text. This I believe can already be encoded UTF-8 even without ACR7007, so 
> you can have international characters in SQL even now.
> <kmarsden> Once the statement is prepared, you execute it with ECSQLSTT.  Of 
> course when you do this, you send again the package name, consitency token 
> and section number so the server can look it up and match it to the prepared 
> statement
> <etiago> hmm but is it actually done right now in Derby?
> <kmarsden> SQLDTA contains the parameter data and may just have place holders 
> for large objects which are sent in an EXTDTA
> <etiago> ok
> <kmarsden> If the statement is a query, the AR sends an OPNQRY, and gets an 
> OPNQRYRM and then iterates with CNTQRY (sort of like next but for many rows) 
> and gets QRYDTA and possibly EXTDTA objects back with the data.
> <kmarsden> En the end the server sends a ENDQRYRM and a SQLCARD indicating 
> the end of data.
> <kmarsden> The SQLCARD is an interesting thing. Generally it is used for 
> SQLExceptions, but there is a special one for end of data.
> <etiago> so a simple SELECT statement actually represents several DRDA 
> commands, right?
> <kmarsden> right, but not nearly as many as in that book.  I still don't 
> understand why it's so big. The ones I mentioned here are pretty much the 
> pages you will go to over and over again
> <etiago> hehe fair enough :)
> <kmarsden> When the client connection is all done, there is no final end 
> connection command that it sends, which really irritating. The socket is 
> closed on the client and the server has to detect that and clean up. This has 
> lead to many leak issues over the years. Maybe someday there will be an 
> actual command which will make things easier.
> <etiago> maybe another ACR is due
> <kmarsden> So I guess that's the end of the protocol story for today, unless 
> you have questions.
> <kmarsden> I wonder too if dag is here and  if so how many lies he can spot.
> <etiago> haha I think that covers it for now, I will have questions over time 
> I'm sure
> <kmarsden> Yes, maybe you can write the ACR's for extending the character 
> string lengths and a connection termination command.  It will look nice on 
> your resume and nice for opengroup to have input from a more diverse group.
> <etiago> maybe #:)
> <etiago> maybe I still have a too simplistic insight of DRDA on Derby, but I 
> think this will all boil down to detect the level of the encoding and 
> instantiate the right encoder class based on that
> <kmarsden> I am a bit torn sometimes about whether we should just branch out 
> since we have our own ciient.
> <kmarsden> Yes, that sounds right.
> ------------------------------------8<----------------------
>
> Regards,
> Tiago Espinha
>
>
>
>
>





Reply via email to