Hi Joe, here is the updated webrev where I incorporated your suggestions:
http://cr.openjdk.java.net/~clanger/webrevs/8153781.1/ I also added a testcase. As for the message " DoctypedeclNotClosed ": I did it in several languages but there are some languages where I don't have the knowledge to create a localized string. Is there some process for this or would we just be ok with reverting to the English text for that particular message? Thanks in advance for your review :) All the best Christoph > -----Original Message----- > From: huizhe wang [mailto:huizhe.w...@oracle.com] > Sent: Dienstag, 12. April 2016 23:28 > To: Langer, Christoph <christoph.lan...@sap.com> > Cc: core-libs-dev@openjdk.java.net > Subject: Re: RFR: JDK-8153781 Issue in XMLScanner: > EXPECTED_SQUARE_BRACKET_TO_CLOSE_INTERNAL_SUBSET when skipping > large DOCTYPE section with CRLF at wrong place > > > On 4/12/2016 11:50 AM, Langer, Christoph wrote: > > Hi Joe, > > > > thanks as always for your suggestions and I'll try to work it in. It will > > probably > take me a little while as I'm currently busy with another thing. I'll update > my > webrev eventually and add a testcase, too. > > Ok. > > > > But one question: I understand that the fix in skipDTD will be sufficient > > to fix > the issue. Nevetheless, can you explain me why the change in scanData is not > beneficial or could even cause issues? There are several places in scanData > when further loads are done. But only at this point where there's exactly one > character after CRLF at the end of the buffer the method just returns without > further load. If it wouldn't leave here it seems to me as if it would continue > correctly and load more data. That would probably also be better in the sense > of performance I guess?? > > It's there to handle the situation where a surrogate pair got split in > between buffers. > > -Joe > > > > > Thanks > > Christoph > > > >> -----Original Message----- > >> From: huizhe wang [mailto:huizhe.w...@oracle.com] > >> Sent: Dienstag, 12. April 2016 19:54 > >> To: Langer, Christoph <christoph.lan...@sap.com> > >> Cc: core-libs-dev@openjdk.java.net > >> Subject: Re: RFR: JDK-8153781 Issue in XMLScanner: > >> EXPECTED_SQUARE_BRACKET_TO_CLOSE_INTERNAL_SUBSET when > skipping > >> large DOCTYPE section with CRLF at wrong place > >> > >> Also, EXPECTED_SQUARE_BRACKET_TO_CLOSE_INTERNAL_SUBSET was a > >> wrong msg > >> id. It would be good to change that to DoctypedeclNotClosed and add a > >> message to XMLMessages.properties right before > DoctypedeclUnterminated, > >> sth. like the following: > >> > >> DoctypedeclNotClosed = The document type declaration for root element > >> type \"{0}\" must be closed with '']''. > >> > >> Thanks, > >> Joe > >> > >> On 4/11/2016 5:49 PM, huizhe wang wrote: > >>> On 4/7/2016 1:45 PM, Langer, Christoph wrote: > >>>> Hi, > >>>> > >>>> > >>>> > >>>> I've run into a peculiar issue with Xerces. > >>>> > >>>> > >>>> > >>>> The problem is happening when a DTD shall be skipped, the DTD is > >>>> larger than the buffer of the current entity and a CRLF sequence > >>>> occurs just one char before the buffer end. > >>>> > >>>> > >>>> > >>>> The reason is that method skipDTD of class XMLDTDScannerImpl (about > >>>> line 389) calls XMLEntityScanner.scanData() to scan for the next > >>>> occurrence of ']'. The scanData method might return true which > >>>> indicates that the delimiter ']' was not found yet and more data is > >>>> to scan. Other users of scanData would handle this and call this > >>>> method in a loop until it returns false or some other condition > >>>> happens. So I've fixed that place like at the other callers of scanData. > >>> This part of the change looks good. > >>>> > >>>> > >>>> Nevertheless, the scanData method would usually load more data when > >>>> it is at the end of the buffer. But in the special case when CRLF is > >>>> found at the end of buffer - 1, scanData would just return true. So I > >>>> also removed that check at line 1374 in XMLEntityScanner. Do you see > >>>> any problem with that? Is there any reason behind it which I'm > >>>> overseeing? > >>> No need to remove this after the above change. The parser needs to > >>> retain what's in the xml, e.g., not removing new lines. > >>>> Furthermore I took the chance for further little cleanups. I've added > >>>> the new copyright header to the files... is that the correct one? > >>> Yes, that's the right license header. However, > >>>> > >>>> I also aligned the calls to invokeListeners(position) in > >>>> XMLEntityScanner to always pass the actual position from which the > >>>> load is started. Do you think this is correct? > >>> Yes. > >>>> > >>>> > >>>> Here is the bug: > >>>> > >>>> https://bugs.openjdk.java.net/browse/JDK-8153781 > >>>> > >>>> > >>>> > >>>> Here is the webrev: > >>>> > >>>> http://cr.openjdk.java.net/~clanger/webrevs/8153781.0/ > >>>> > >>>> > >>>> > >>>> Please give me some comments before I finalize my change including a > >>>> jtreg testcase. > >>> It would be better if you had included the testcase so that the review > >>> can be done together with the code change. > >>> > >>> Thanks, > >>> Joe > >>> > >>>> > >>>> > >>>> Thanks & Best regards > >>>> > >>>> Christoph > >>>> > >>>> > >>>>