[ http://issues.apache.org/jira/browse/DERBY-125?page=all ]

A B updated DERBY-125:
----------------------

    Attachment: offByOne.jar


I spent some time looking at the existing test case for this issue and at 
Bryan's excellent description of the problem.  I played around with both and 
was able to come up with a repro for the problem that actually causes a failure 
on JCC.

This isn't by any means a polished repro--the test would need to be cleaned up 
before it could be added to the nightlies.  But I thought I'd post what I found 
and see what the reactions are.

The repro is called offByOne.java (it's part of offByOne.jar).

The theory behind this test is based on a quote from Bryan's email in the 
following thread:

http://article.gmane.org/gmane.comp.apache.db.derby.devel/11865

The quote was:

<begin_quote>

Note that the only "random" or "nondeterministic" part of this damage is that
the very last byte in the large DDM object gets replaced by an unknown byte
from the unused section of the 'bytes' array. If the bytes array was just
expanded in order to accommodate this object, then the unused section of the
array will contain zeros, but if the bytes array was previously used for a
larger amount of data, then those unused bytes contain unknown values. 

<end_quote>

So what I've done with this test is remove the "random" part.
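
To make the mechanism concrete, here's a tiny standalone sketch (my own 
illustration, not Derby code -- all names are made up) of how a reused byte 
array keeps stale data in its unused tail:

public class StaleBytesDemo {
    public static void main(String[] args) {
        byte[] bytes = new byte[8];               // fresh allocation: all zeros
        byte[] first = {1, 2, 3, 4, 5, 6, 7, 8};
        System.arraycopy(first, 0, bytes, 0, first.length);

        // Reuse the same array for a smaller payload: only the first four
        // bytes are overwritten, so bytes[4..7] still hold 5, 6, 7, 8.
        byte[] second = {9, 9, 9, 9};
        System.arraycopy(second, 0, bytes, 0, second.length);

        // A length calculation that is off by one and reads a byte past
        // the new payload picks up a stale 5 here instead of a zero.
        System.out.println(bytes[second.length]); // prints 5
    }
}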

Basically, I take the repro that is already part of the existing patch and I 
run it twice.  In the first iteration, I make both the table name and the 
column name extra long, so that the server-side buffer has more data in it.  
In the second iteration, I use simpler names for the table and column, which 
take up less space in the server buffer.  Then, per Bryan's quote, "if the bytes array 
was previously used for a larger amount of data, then those unused bytes 
contain unknown values".  But since we intentionally put the "larger amount of 
data" into the buffer during the first iteration, we know what those "unknown" 
values are going to be.  Then, by twiddling the size of the table/column names, 
we can 'shift' the data until we reach a point where the off-by-one error 
manifests itself: namely, we end up incorrectly leaving old data in the current 
server buffer.
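
In rough outline, the shape of the test is something like the sketch below.  
This is NOT the actual offByOne.java: the pass-one name lengths, the marker 
count, the URL, and all identifiers are placeholders I made up for 
illustration.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class TwoPassSketch {

    // Pad a base identifier out to an exact length with 'X' characters.
    static String pad(String base, int len) {
        StringBuilder sb = new StringBuilder(base);
        while (sb.length() < len) sb.append('X');
        return sb.toString();
    }

    // Stand-in for the repro from the existing patch: create a table, then
    // prepare a SELECT with enough parameter markers to drive a large reply.
    static void runRepro(Connection conn, String table, String col)
            throws Exception {
        try (Statement s = conn.createStatement()) {
            s.execute("CREATE TABLE " + table + " (" + col + " INT)");
        }
        StringBuilder sql = new StringBuilder(
            "SELECT " + col + " FROM " + table + " WHERE " + col + " IN (?");
        for (int i = 0; i < 5000; i++) sql.append(", ?");  // placeholder count
        sql.append(")");
        conn.prepareStatement(sql.toString()).close();
        try (Statement s = conn.createStatement()) {
            s.execute("DROP TABLE " + table);
        }
    }

    public static void main(String[] args) throws Exception {
        Connection conn = DriverManager.getConnection(
            "jdbc:derby://localhost:1527/testdb;create=true");
        // Pass 1: extra-long names fill the server buffer with known data.
        runRepro(conn, pad("T", 120), pad("C", 118));
        // Pass 2: shorter names leave stale bytes from pass 1 behind; the
        // exact lengths matter (see below).
        runRepro(conn, pad("T", 99), pad("C", 97));
        conn.close();
    }
}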

I've also attached the traces for Derby Client and JCC with and without the 
patch for DERBY-125.  Without the patch, search the trace file for the lines 
beginning with "8160".  There will be two such lines: one from the first 
"iteration" of the repro, and one from the second.  With Derby Client, the 
first occurrence looks like this:

8160   000000000401F100  0000000000000000 

and the second one looks like this:

8160   000000000001                       

If I'm understanding correctly, the "01" in the second line is INCORRECT--it 
should be "00".  But because of the off-by-one error, the "01" is left over 
from the first iteration, which is a manifestation of the problem described for 
DERBY-125.  A similar phenomenon can be seen from the JCC traces:

First occurrence:

8160   000000000401F100  0000000000000000  

Second occurrence:

8160   000000000001002C  D052000300262205 

For JCC, the second occurrence has additional data in it--it looks like an 
OPNQRYRM is chained to the data?  This might be a result of the 'defer' 
behavior; I'm not sure.

For whatever reason, this error goes "unseen" by the Derby Client, but for JCC, 
the result is an error in the repro, namely:

Exception in thread "main" com.ibm.db2.jcc.c.DisconnectException: Execution 
failed due to a distribution protocol error that caused deallocation of the 
conversation.  A DRDA Data Stream Syntax Error was detected.  Reason: 0x3
        at com.ibm.db2.jcc.a.bb.l(bb.java:1206)
        at com.ibm.db2.jcc.a.bb.c(bb.java:363)
        at com.ibm.db2.jcc.a.bb.v(bb.java:1439)
        at com.ibm.db2.jcc.a.eb.c(eb.java:56)
        at com.ibm.db2.jcc.a.r.c(r.java:42)
        at com.ibm.db2.jcc.a.tb.h(tb.java:169)
        at com.ibm.db2.jcc.c.zc.p(zc.java:1223)
        at com.ibm.db2.jcc.c.ad.d(ad.java:2246)
        at com.ibm.db2.jcc.c.ad.U(ad.java:489)
        at com.ibm.db2.jcc.c.ad.executeQuery(ad.java:472)

So while this isn't ideal, at least we'd have a test for JCC that doesn't rely 
on any other fixes (esp. DERBY-492) and that would indicate a regression.  That 
said, things to consider would be:

1) Why does this error occur in JCC but not in the client?  Is it related to 
deferred prepares, or something else?  Can/should the client recognize this 
error and do something about it?

2) Is there some way to modify this repro so that the client fails, too?

All of that said, I have to admit the repro is a bit fragile and hackish.  In 
order to get this to work, the second iteration has to have a table name that 
is exactly 99 characters long and a column name that is exactly 97 characters 
long.  There are other length combinations that will trigger the problem as 
well, but just picking two lengths at random probably won't work.  So yes, 
it's ugly.
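
If someone does pick this up, one way to make the test less brittle might be 
to scan length combinations rather than hard-coding them.  Reusing pad() and 
runRepro() from the sketch above, that could look roughly like:

// Hypothetical brute-force scan for failing length combinations (an idea
// for follow-up work, not something offByOne.java does today).
for (int tableLen = 90; tableLen <= 110; tableLen++) {
    for (int colLen = 90; colLen <= 110; colLen++) {
        try {
            runRepro(conn, pad("T", tableLen), pad("C", colLen));
        } catch (Exception e) {
            System.out.println("failure at table=" + tableLen
                    + " col=" + colLen + ": " + e.getMessage());
        }
    }
}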

Nonetheless, at least we have a test case that fails without the DERBY-125 
patch and passes with it, which was the big goal.  If after reading this 
comment anyone feels like this is too hacky/unreliable, then so be it--I won't 
be offended :)

I don't really have time to continue looking at this, but I'm hoping someone 
out there can follow up on this and turn it into something useful.  My 
apologies for not taking this further, but I have a lot going on right now...

To run the repro:

> java offByOne

  --> Runs with Derby Client

> java offByOne jcc

  --> Runs with JCC
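
(For context, the driver switch presumably amounts to something like the 
sketch below; the class names are the standard ones for the two drivers, but 
the URLs, port, and database name are my assumptions, not copied from 
offByOne.java.)

import java.sql.Connection;
import java.sql.DriverManager;

public class DriverSelect {
    public static void main(String[] args) throws Exception {
        String url;
        if (args.length > 0 && args[0].equalsIgnoreCase("jcc")) {
            // JCC (the DB2 Universal Driver) against the Network Server.
            Class.forName("com.ibm.db2.jcc.DB2Driver");
            url = "jdbc:derby:net://localhost:1527/testdb";
        } else {
            // Derby Client (the default).
            Class.forName("org.apache.derby.jdbc.ClientDriver");
            url = "jdbc:derby://localhost:1527/testdb";
        }
        try (Connection conn = DriverManager.getConnection(url)) {
            System.out.println("Connected via " + url);
        }
    }
}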

Hope this is helpful...

> Network Server can send DSS greater than 32K to client, which breaks DRDA 
> protocol.
> -----------------------------------------------------------------------------------
>
>          Key: DERBY-125
>          URL: http://issues.apache.org/jira/browse/DERBY-125
>      Project: Derby
>         Type: Bug
>   Components: Network Server
>     Reporter: A B
>     Assignee: Bryan Pendleton
>  Attachments: changes.html, offByOne.jar, repro.java, svn_jan_12_2006.diff, 
> svn_jan_12_2006.status
>
> BACKGROUND:
> DRDA protocol, which is the protocol used by Derby Network Server, dictates 
> that all DSS objects "with data greater than 32,763 bytes" should be broken 
> down into multiple "continuation" DSSes.
> PROBLEM:
> When Network Server receives a "prepareStatement" call that has a very large 
> number of parameters, it can end up sending a reply DSS that is greater than 
> 32K long to the client; doing so breaks DRDA protocol.
> REPRODUCTION:
> Note that this reproduction does NOT cause a protocol exception against the 
> JCC driver--without further investigation, it would appear JCC doesn't mind 
> that the DSS is too long.  However, other DRDA clients (such as the DB2 ODBC 
> client) will see that the data is too long and will fail because of it.
> To reproduce, one can create a simple table and then prepare a statement such 
> as:
> SELECT id FROM t1 WHERE id in ( ?, ?, [ ... lots and lots of param markers 
> ... ], ?)
> Note that JCC uses deferred prepare by default; when connecting, one must 
> append the "deferPrepares=false" attribute to the end of the URL in order to 
> reproduce the problem (that or just try to execute the statement, since 
> preparation will be done at execution time).  When doing the prepare, I added 
> a line in the "flush" method of org.apache.derby.impl.drda.DDMWriter.java to 
> see how large the reply DSS was, and for cases where the number of parameter 
> markers was high, the number of bytes in the single DSS would surpass 32K, 
> and thus break protocol.
> NOTES:
> Network Server correctly uses continuation DSSes for LOBs and for result set 
> data (data returned as the result of a query) that is greater than 32K.  The 
> problem appears to be in "other" cases, such as for the prepareStatement call 
> described above.
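
A side note on the quoted reproduction: with JCC, the deferPrepares attribute 
is appended to the connection URL, roughly like this (host, port, database 
name, and the exact URL syntax are placeholders and may vary by JCC version):

// deferPrepares=false makes JCC send the prepare immediately, so the
// oversized reply DSS shows up at prepare time instead of execute time.
String url = "jdbc:derby:net://localhost:1527/testdb:deferPrepares=false;";
Connection conn = DriverManager.getConnection(url);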

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira
