On Thursday 14 October 2010 8:36:33 pm Benson Margulies wrote:
> Your analysis seems plausible to me. To test this for sure, it would
> require finding a way to inject a stream that would deliver the right
> wrong stuff. That would be an entertaining programming project. Dan,
> what do you think?

It shouldn't be TOO tricky. Take a look at the AttachmentDeserializerTest
in rt/core. It creates ByteArrayInputStreams for various test cases, but we
could create a subclass of ByteArrayInputStream that only allows reads in
chunks of 20 bytes or something to see what happens.

Dan

>
> On Thu, Oct 14, 2010 at 6:56 PM, Pieper, Aaron <[email protected]> wrote:
> > I'm using CXF 2.2.10. I'm having a problem with some MTOM attachments. It
> > started when I upgraded from CXF 2.2.2 to CXF 2.2.3. The bug is that
> > after calling a service which returned an MTOM attachment, when I try to
> > parse the attachment, I sometimes get an error:
> >
> > java.io.IOException: Underlying input stream returned zero bytes
> >     at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:268)
> >     at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:306)
> >     at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:158)
> >     at java.io.InputStreamReader.read(InputStreamReader.java:167)
> >     at java.io.Reader.read(Reader.java:123)
> >     at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1128)
> >     at org.apache.commons.io.IOUtils.copy(IOUtils.java:1104)
> >     at org.apache.commons.io.IOUtils.copy(IOUtils.java:1050)
> >     at org.apache.commons.io.IOUtils.toString(IOUtils.java:359)
> >     at com.pragmatics.AsyncUtils.messageToString(AsyncUtils.java:18)
> >
> > The error only happens for some attachments - about 25% of them. It's a
> > seemingly arbitrary 25% - it's not like, the biggest 25% or the ones
> > that have special characters. I was able to track this down to
> > MimeBodyPartInputStream. MimeBodyPartInputStream has some logic in
> > processBuffer for reading the boundary.
> > It goes like this:
> >
> > while ((boundaryIndex < boundary.length)
> >         && (value == boundary[boundaryIndex])) {
> >     if (!hasData(buffer, initialI, i + 1, off, len)) {
> >         return initialI - off;
> >     }
> >     value = buffer[++i];
> >     boundaryIndex++;
> > }
> >
> > So, basically, when MimeBodyPartInputStream finds the start of a
> > boundary, it reads from the stream until either there are no more
> > characters to read, or until it has read the entire boundary. The problem
> > with this logic is that it assumes the entire boundary will be read in
> > the same call to the underlying InputStream. This assumption isn't
> > always true. Specifically, when I'm fetching an attachment in my
> > application, this MimeBodyPartInputStream is backed by an
> > HttpURLConnection.HttpInputStream. This HttpInputStream sometimes
> > fetches as few as 24 characters; I guess that's just how the
> > HttpInputStream works. But if these 24 characters happen to fall on one
> > of these MIME boundaries, it can cause problems.
> >
> > One problem, which I'm running into here, is that the
> > MimeBodyPartInputStream's read(byte[],int,int) method returns 0, since
> > the only bytes that were read were parts of the MIME boundary. In
> > returning 0, it breaks InputStream's contract, which states that, for a
> > non-zero requested length, the read method will only ever return a
> > positive integer (if some bytes were read) or -1 (if the end of the
> > stream was reached). There are probably other possible problems - it
> > seems like it's possible MimeBodyPartInputStream might misunderstand
> > whether or not it's hit a boundary in some cases. I haven't run into
> > that problem though.
> >
> > I was hesitant to submit a tracker for this issue, since I don't 100%
> > understand all of the pieces involved. Since the bug is dependent on
> > HttpInputStream, I haven't really been able to create a test case for
> > it, unless I do weird things like create my own InputStream class which
> > behaves in weird ways. Should I submit it anyway?
> > It fortunately only
> > causes problems in my test code, but it seems like an important issue.
> >
> > - Aaron

-- 
Daniel Kulp
[email protected]
http://dankulp.com/blog
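
For reference, the "subclass of ByteArrayInputStream that only allows reads in chunks of 20 bytes" idea from the reply above could look roughly like the sketch below. This is only an illustration, not code from CXF: the names ChunkedByteArrayInputStream, ChunkedReadDemo and countReads are made up for this example, and the chunk size of 20 is simply the figure mentioned in the reply. Only the three-argument bulk read is capped, so callers that use other methods (for example readAllBytes on newer JDKs) would not see the short reads.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

/**
 * Hypothetical test helper (not part of CXF): a ByteArrayInputStream that
 * never returns more than chunkSize bytes from a single bulk read, to mimic
 * a network stream that delivers data in small pieces.
 */
class ChunkedByteArrayInputStream extends ByteArrayInputStream {
    private final int chunkSize;

    ChunkedByteArrayInputStream(byte[] data, int chunkSize) {
        super(data);
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        this.chunkSize = chunkSize;
    }

    @Override
    public synchronized int read(byte[] b, int off, int len) {
        // Cap the request; the superclass still does bounds checks and
        // returns -1 once the backing array is exhausted.
        return super.read(b, off, Math.min(len, chunkSize));
    }
}

class ChunkedReadDemo {
    /**
     * Drains the stream with bulk reads and returns how many calls delivered
     * data. Fails if a read returns 0 for a non-zero length, which is the
     * symptom described in the report.
     */
    static int countReads(InputStream in, int bufferSize) throws IOException {
        byte[] buf = new byte[bufferSize];
        int reads = 0;
        int n;
        while ((n = in.read(buf, 0, buf.length)) != -1) {
            if (n == 0) {
                throw new IOException("read returned 0 for a non-zero length");
            }
            reads++;
        }
        return reads;
    }

    public static void main(String[] args) throws IOException {
        // 50 bytes with a 20-byte cap arrive as chunks of 20, 20 and 10.
        InputStream in = new ChunkedByteArrayInputStream(new byte[50], 20);
        System.out.println(countReads(in, 4096)); // prints 3
    }
}
```

In a test, a stream like this would presumably be passed wherever a plain ByteArrayInputStream is constructed today, with the chunk size varied so that a chunk ends partway through a MIME boundary.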
