Re: RFR for 6443578: Continuation lines in JAR manifests do not follow RFC-822

Philipp Kunz Fri, 04 May 2018 01:27:08 -0700

Hi Sherman,

Thanks again for pointing out that fewer changes are sufficient. Iprepared another patch for 6443578.


http://files.paratix.ch/jdk/6443578/webrev.02/
http://files.paratix.ch/jdk/6443578/webrev.02.zip

The approach is basically, not to put a line break where a utf charactercontinuation byte follows which can be tested with a bit mask.

For telling first from continuation bytes of utf characters I re-used amethod isNotUtfContinuationByte from either StringCoding orUTF_8.Decoder. Unfortunately I found no way not to duplicate it which Iconsider subject of already known bug 6184334. In case this patch isaccepted, I suggest to add a comment to 6184334 that refers to the newcopy in Manifest#make72Safe.

LineBrokenMultiByteCharacter should not be removed or not so immediatelyat least because someone might attempt to sign an older jarfile createdwithout that patch with a newer jarsigner that already contains it ifthe jar previously contained digests presumably from another signature.


Looking forward to any feedback and with kind regards,
Philipp




On 03.05.2018 08:56, Xueming Shen wrote:

Philipp,
I kinda recalled JDK-6202130 which I read long time ago and I believedit's not a bug butjust wanted to leave it open and for double check, then it's off thepriority list somehow...
Reread the code below I think it handles utf8 and the 72-length-limitappropriately, though
a little tricky,

(1) value.getByte("utf8") + deprecated String(byte[], int, int, int);
which creates a String that each "char" inside that String objectactually represents one bytevalue of the resulted utf8 byte sequence, with its utf16 value =[hi-byte= 0 and low-byte="utf8-byte"]
(2) append this String to the StringBuffer (--> StringBuilder), nowthe sb.length() actually is thelength of the utf8 byte sequence, make72Safe(...) is used toadjust the length to 72 working
     on "chars" inside the buffer.

(3) write out the adjusted buffer via DataOutputStream.writeBytes()
in which the "write" cuts off the hi-byte of that utf16, so youactually get the original
     utf-8 byte sequence output to the stream.
Sure the implementation looks "interesting" and uses deprecated Stringconstructor, but it was
written 2 decades ago.
Back then I think we were thinking maybe the fix is to simply removethe "misleading" commentline below in the source code, if the implementation actually supportsutf-8 + 72-limit.
" * XXX Need to handle UTF8 values and break up lines longer than72 bytes"
Any reason/test case to believe the mechanism does not work?
BTW the logic of line below in writeMain() appears to be incorrect. Ido have another bug for it." if ((version != null) &&!(name.equalsIgnoreCase(vername))) {"
----------------------

                StringBuffer buffer = new StringBuffer(name);
                buffer.append(": ");

                String value = (String) e.getValue();
                if (value != null) {
                    byte[] vb = value.getBytes("UTF8");
                    value = new String(vb, 0, 0, vb.length);
                }
                buffer.append(value);

                Manifest.make72Safe(buffer);
                buffer.append("\r\n");
                out.writeBytes(buffer.toString());

-------------------------

sherman



On 5/1/18, 10:21 PM, Philipp Kunz wrote:
Hi,
Recently, I tried to fix only bug 6202130 with the intention to fixbug 6443578 later with the intention to get some opportunity forfeedback, but haven't got any, and propose now a fix for bothtogether which in my opinion makes more sense.
See attached patch.

Some considerations, assumptions, and explanations

 * In my opinion, the code for writing manifests was distributed in the
   two classes Attributes and Manifest in an elegant way but somewhat
   difficult to explain the coherence. I chose to group the code that
   writes manifests into a new class ManifestWriter. The main incentive
   for that was to prevent or reduce duplicated code I would have had
   to change twice otherwise. This also results in a source file of a
   suitable size.
 * I could not support the assumption that the write and writeMain
   methods in Attributes couldn't be referenced anywhere so I
   deprecated them rather than having them removed.
 * I assumed the patch will not make it into JDK 10 and, hence, the
   deprecated annotations are attributed with since = 11.
 * I could not figure out any reason for the use of DataOutputStream
   and did not use it.
 * Performance-wise I assume that the code is approximately comparable
   to the previous version. The biggest improvement in this respect I
   hope comes from removing the String that contains the byte array
   constructed with deprecated String(byte[], int, int, int) and then
   copying it over again to a StringBuffer and from there to a String
   again and then Characters. On the other hand, keeping whole
   characters together when breaking lines might make it slightly
   slower. I hope my changes are an overall improvement, but I haven't
   measured it.
 * For telling first from continuation bytes of utf-8 characters apart
   I re-used a method isNotUtfContinuationByte from either StringCoding
   or UTF_8.Decoder. Unfortunately I found no way not to duplicate it.
 * Where it said before "XXX Need to handle UTF8 values and break up
   lines longer than 72 bytes" in Attributes#writeMain I did not dare
   to remove the comment completely because it still does not deal
   correctly with version headers longer than 72 bytes and the set of
   allowed values. I changed it accordingly. Two similar comments are
   removed in the patch.
 * I added two tests, WriteDeprecated and NullKeysAndValues, to
   demonstrate compatibility as good as I could. Might however not be
   desired to keep and having to maintain.
 * LineBrokenMultiByteCharacter for jarsigner should not be removed or
   not so immediately because someone might attempt to sign an older
   jarfile created without that patch with a newer jarsigner that
   already contains it.
suggested changes or additions to the bug database: (i have nopermissions to edit it myself)
 * Re-combine copies of isNotUtfContinuationByte (three by now).
   Relates to 6184334. Worth to file another issue?
 * Manifest versions have specific specifications, cannot break across
   lines and can contain a subset of characters only. Bug 6910466
   relates but is not exactly the same. If someone else is convinced
   that writing a manifest should issue a warning or any other way to
   deal with a version that does not conform to the specification, I'd
   suggest to create a separate bug for that.
Now, I would be glad if someone sponsored a review. This is only mythird attempt to submit a patch which is why I chose a lesserimportant subject to fix in order to get familiar and now Iunderstand it's not the most attractive patch to review. Please don'thesitate to suggest what I could do better or differently.
As a bonus, with these changes, manifest files will always bedisplayed correctly with just any utf capable viewer even if theycontain multi-byte utf characters that would have been broken acrossa line break with the current/previous implementation and allmanifests will become also valid strings in Java.


--


Gruss Philipp



------------------------------------------------------------------------



Paratix GmbH
St Peterhofstatt 11
8001 Zürich

+41 (0)76 397 79 35
philipp.k...@paratix.ch <mailto:philipp.k...@paratix.ch>

--- old/src/java.base/share/classes/java/util/jar/Manifest.java	2018-05-04 08:08:06.786283203 +0200
+++ new/src/java.base/share/classes/java/util/jar/Manifest.java	2018-05-04 08:08:06.554281154 +0200
@@ -166,12 +166,14 @@
     }
 
     /**
-     * Adds line breaks to enforce a maximum 72 bytes per line.
+     * Adds line breaks to enforce a maximum of 72 bytes per line and with
+     * respect not to break within multi-byte encoded utf characters.
      */
     static void make72Safe(StringBuffer line) {
         int length = line.length();
         int index = 72;
         while (index < length) {
+            while ((line.charAt(index) & 0xc0) == 0x80) index--;
             line.insert(index, "\r\n ");
             index += 74; // + line width + line break ("\r\n")
             length += 3; // + line break ("\r\n") and space
--- /dev/null	2018-04-29 19:08:02.331411111 +0200
+++ new/test/jdk/java/util/jar/Manifest/LineBreakCharacter.java	2018-05-04 08:08:07.226287091 +0200
@@ -0,0 +1,403 @@
+/*
+ * Copyright (c) 2018, Oracle and/or its affiliates. All rights reserved.
+ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
+ *
+ * This code is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 only, as
+ * published by the Free Software Foundation.
+ *
+ * This code is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+ * version 2 for more details (a copy is included in the LICENSE file that
+ * accompanied this code).
+ *
+ * You should have received a copy of the GNU General Public License version
+ * 2 along with this work; if not, write to the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
+ *
+ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
+ * or visit www.oracle.com if you need additional information or have any
+ * questions.
+ */
+
+import static java.nio.charset.StandardCharsets.UTF_8;
+
+import java.io.ByteArrayInputStream;
+import java.io.ByteArrayOutputStream;
+import java.io.DataOutputStream;
+import java.io.IOException;
+import java.util.jar.Attributes;
+import java.util.jar.Manifest;
+import java.util.jar.Attributes.Name;
+import java.util.List;
+import java.util.LinkedList;
+
+import org.testng.annotations.Test;
+import org.testng.annotations.DataProvider;
+import static org.testng.Assert.*;
+
+/**
+ * @test
+ * @bug 6443578
+ * @run testng LineBreakCharacter
+ * @summary Tests reading and writing of jar manifests with headers broken
+ *          across lines in conjunction with multi-byte utf characters.
+ * <p>
+ * Line breaks may or may not occur within utf encoded characters that are
+ * represented with more than one byte. Even though characters should not be
+ * broken across lines according to the specification the previous Manifest
+ * implementation did it and this test makes sure that no multi-byte characters
+ * are broken apart across a line break when writing manifests and manifests
+ * are read correctly no matter if multi-byte characters are continued after a
+ * line break.
+ */
+public class LineBreakCharacter {
+
+    static final int MANIFEST_LINE_CONTENT_WIDTH_BYTES = 72;
+
+    static final String FILL = "x";
+
+    /**
+     * By using names of four characters the same values can be used for
+     * testing line breaks at exact positions for both header and section
+     * names, header names in main and named sections, because a named
+     * section name is represented with a header with key "Name" that occurs
+     * after a blank line and also has four characters.
+     */
+    static final String NAME = FILL + FILL + FILL + FILL;
+
+    /**
+     * Cover main attributes headers, section names, and headers in named
+     * sections because an implementation might make a difference.
+     */
+    enum PositionInManifest {
+        MAIN_ATTRIBUTES, SECTION_NAME, NAMED_SECTION;
+    }
+
+    /**
+     * @see LineBreakLineWidth#numByteUtfCharacter(int, int)
+     */
+    static String numByteUtfCharacter(int numBytes, int seed) {
+        seed = seed < 0 ? -seed : seed;
+        if (numBytes == 1) {
+            seed %= 0x5F;
+            seed += 0x20; // exclude control characters (0..0x19) here
+        } else if (numBytes == 2) {
+            seed %= 0x800 - 0x80;
+            seed += 0x80;
+        } else if (numBytes == 3) {
+            seed %= 0x10000 - 0x800 + 0xFDD0 - 0xFDEF + 0xFFFE - 0xFFFF;
+            seed += 0x800
+                    + (seed >= 0xFDD0 ? 0xFDEF - 0xFDD0 : 0) // non-characters
+                    + (seed % 0x10000) * (0xFFFF - 0xFFFE); // byte order marks
+        } else {
+            seed %= 0x110000 - (0x10000 - 0xFFFE);
+            seed += 0x10000
+                    + (seed % 0x10000) * (0xFFFF - 0xFFFE); // byte order marks
+        }
+
+        String string = new String(Character.toChars(seed));
+        assertEquals(string.getBytes(UTF_8).length, numBytes,
+                "self-test failed: unexpected utf encoded character length");
+        return string;
+    }
+
+    @DataProvider(name = "lineBreakParameters")
+    public static Object[][] lineBreakParameters() {
+        LinkedList<Object[]> params = new LinkedList<>();
+
+        // b: number of line breaks before character under test
+        for (int b = 0; b <= 3; b++) {
+
+           // c: character utf encoded length in bytes
+           for (int c = 1; c <= 4; c++) {
+
+                // p: potential break position offset in bytes
+                // p = 0 => before character
+                // p = c => after character and
+                // 0 < p < c => character potentially broken across line break
+                //              within the character
+                for (int p = c; p >= 0; p--) {
+
+                    // a: no or one character following the one under test
+                    // (a = 0 meaning the character under test is followed by
+                    // end of value which is also represented as a line break)
+                    for (int a = 0; a <= 1; a++) {
+
+                        // offset: so many characters (actually bytes here
+                        // filled with one byte characters) are needed to place
+                        // the next character into a position relative to the
+                        // maximum line width that it may or may not have to be
+                        // broken onto the next line
+                        int offset =
+                                // number of lines; - 1 is continuation space
+                                b * (MANIFEST_LINE_CONTENT_WIDTH_BYTES - 1)
+                                // line length minus "Name: ".length()
+                                + MANIFEST_LINE_CONTENT_WIDTH_BYTES - 6
+                                // position of maximum line width relative to
+                                // beginning of encoded character
+                                - p;
+                        int seed = ((a * 2 + p) * c + c) * 4 + b;
+                        String value = "";
+                        for (int i = 0; i < offset - 1; i++) {
+                            value += numByteUtfCharacter(1, seed++);
+                        }
+                        // character before the one to test the break
+                        value += FILL;
+                        String character = numByteUtfCharacter(c, seed);
+                        value += character;
+                        for (int i = 0; i < a; i++) {
+                            // character after the one to test the break
+                            value += FILL;
+                        }
+
+                        for (PositionInManifest i :
+                                PositionInManifest.values()) {
+
+                            params.add(new Object[] {
+                                    b, c, p, a, i, character, value});
+                        }
+                    }
+                }
+            }
+        }
+
+        return params.toArray(new Object[0][0]);
+    }
+
+    /**
+     * Checks that multi-byte utf characters work well with line breaks and
+     * continuation lines in jar manifests.
+     *
+     * Produces otherwise arbitrary manifests so that a specific character
+     * will not just fit on the same line and a line break and continuation
+     * line are required with:<ul>
+     * <li>different encoded sizes of characters: one, two, three, and four
+     * bytes per character (including also utf bmp and surrogate pairs)</li>
+     * <li>different amount of space remaining on the same line or relative
+     * position of the latest possible line break position with respect to the
+     * character that can be continued on the same line or will have to be
+     * continued on the next line after a line break</li>
+     * <li>different number of preceding line breaks</li>
+     * </ul>
+     * For each of these cases the break position is verified in the binary
+     * manifest.
+     */
+    @Test(dataProvider = "lineBreakParameters")
+    public void testWriteLineBreaksKeepCharactersTogether(int b, int c, int p,
+            int a, PositionInManifest i, String character, String value)
+                    throws IOException {
+        byte[] mfBytes = writeManifest(i, NAME, value);
+
+        // expect the whole character on the next line unless it fits
+        // completely on the current line
+        boolean breakExpected = p < c;
+        String brokenPart = character;
+        if (breakExpected) {
+            brokenPart = "\r\n " + brokenPart;
+        }
+        // expect a line break before the next character if there is a next
+        // character and the previous not already broken on next line
+        if (a == 1) {
+            if (!breakExpected) {
+                brokenPart += "\r\n ";
+            }
+            brokenPart += FILL;
+        }
+        brokenPart = FILL + brokenPart + "\r\n";
+        assertOccurrence(mfBytes, brokenPart.getBytes(UTF_8));
+        readManifestAndAssertValue(mfBytes, i, NAME, value);
+    }
+
+    static byte[] writeManifest(PositionInManifest i, String name,
+            String value) throws IOException {
+        Manifest mf = new Manifest();
+        mf.getMainAttributes().put(Name.MANIFEST_VERSION, "1.0");
+        Attributes attributes = new Attributes();
+
+        switch (i) {
+        case MAIN_ATTRIBUTES:
+            mf.getMainAttributes().put(new Name(name), value);
+            break;
+        case SECTION_NAME:
+            mf.getEntries().put(value, attributes);
+            break;
+        case NAMED_SECTION:
+            mf.getEntries().put(NAME, attributes);
+            attributes.put(new Name(name), value);
+            break;
+        }
+
+        ByteArrayOutputStream out = new ByteArrayOutputStream();
+        mf.write(out);
+        byte[] mfBytes = out.toByteArray();
+        System.out.println("-------------------------------------------"
+                + "-----------------------------");
+        System.out.print(new String(mfBytes, UTF_8));
+        System.out.println("-------------------------------------------"
+                + "-----------------------------");
+        return mfBytes;
+    }
+
+    static void assertOccurrence(byte[] mf, byte[] part) {
+        List<Integer> matchPos = new LinkedList<>();
+        for (int i = 0; i < mf.length; i++) {
+            for (int j = 0; j < part.length && i + j <= mf.length; j++) {
+                if (part[j] == 0) {
+                    if (i + j != mf.length) {
+                        break; // expected eof not found
+                    }
+                } else if (i + j == mf.length) {
+                    break;
+                } else if (mf[i + j] != part[j]) {
+                    break;
+                }
+                if (j == part.length - 1) {
+                    matchPos.add(i);
+                }
+            }
+        }
+        assertEquals(matchPos.size(), 1, "not "
+                + (matchPos.size() < 1 ? "found" : "unique") + ": '"
+                + new String(part, UTF_8) + "'");
+    }
+
+    static void readManifestAndAssertValue(
+            byte[] mfBytes, PositionInManifest i, String name, String value)
+                    throws IOException {
+        Manifest mf = new Manifest(new ByteArrayInputStream(mfBytes));
+
+        switch (i) {
+        case MAIN_ATTRIBUTES:
+            assertEquals(mf.getMainAttributes().getValue(name), value,
+                    "main attributes header value");
+            break;
+        case SECTION_NAME:
+            Attributes attributes = mf.getAttributes(value);
+            assertNotNull(attributes, "named section not found");
+            break;
+        case NAMED_SECTION:
+            attributes =
+                    mf.getAttributes(NAME);
+            assertEquals(attributes.getValue(name), value,
+                    "named section attributes header value");
+            break;
+        }
+    }
+
+    @Test(dataProvider = "lineBreakParameters")
+    public void readContinuationLines(int b, int c, int p, int a,
+            PositionInManifest i, String character, String value)
+                    throws IOException {
+        byte[] mfBytes = writeManifestWithBrokenCharacters(i, NAME, value);
+
+        ByteArrayOutputStream buf = new ByteArrayOutputStream();
+        buf.write(FILL.getBytes(UTF_8));
+        byte[] characterBytes = character.getBytes(UTF_8);
+        // the portion of the character that fits on the current line before
+        // a break at 72 bytes, ranges from nothing (p == 0) to the whole
+        // character (p == c)
+        for (int j = 0; j < p; j++) {
+            buf.write(characterBytes, j, 1);
+        }
+        // expect a line break at exactly 72 bytes from the beginning of the
+        // line unless the whole character fits on that line
+        boolean breakExpected = p < c;
+        if (breakExpected) {
+            buf.write("\r\n ".getBytes(UTF_8));
+        }
+        // the remaining portion of the character, if any
+        for (int j = p; j < c; j++) {
+            buf.write(characterBytes, j, 1);
+        }
+        // expect another linebreak if the whole character fitted on the same
+        // line and there is another character
+        if (a == 1) {
+            if (c == p) {
+                buf.write("\r\n ".getBytes(UTF_8));
+            }
+            buf.write(FILL.getBytes(UTF_8));
+        }
+        buf.write("\r\n".getBytes(UTF_8));
+        byte[] brokenPart = buf.toByteArray();
+        assertOccurrence(mfBytes, brokenPart);
+        readManifestAndAssertValue(mfBytes, i, NAME, value);
+    }
+
+    /*
+     * From previous Manifest implementation reduced to the minimum required to
+     * demonstrate compatibility
+     */
+    @SuppressWarnings("deprecation")
+    static byte[] writeManifestWithBrokenCharacters(
+            PositionInManifest i, String name, String value)
+                    throws IOException {
+        byte[] vb = value.getBytes("UTF8");
+        value = new String(vb, 0, 0, vb.length);
+        ByteArrayOutputStream out = new ByteArrayOutputStream();
+        DataOutputStream dos = new DataOutputStream(out);
+        String vername = Name.MANIFEST_VERSION.toString();
+        dos.writeBytes(vername + ": 0.1\r\n");
+
+        if (i == PositionInManifest.MAIN_ATTRIBUTES) {
+            StringBuffer buffer = new StringBuffer(name);
+            buffer.append(": ");
+            buffer.append(value);
+            make72Safe(buffer);
+            buffer.append("\r\n");
+            dos.writeBytes(buffer.toString());
+        }
+        dos.writeBytes("\r\n");
+
+        if (i == PositionInManifest.SECTION_NAME ||
+                i == PositionInManifest.NAMED_SECTION) {
+            StringBuffer buffer = new StringBuffer("Name: ");
+            if (i == PositionInManifest.SECTION_NAME) {
+                buffer.append(value);
+            } else {
+                buffer.append(NAME);
+            }
+            make72Safe(buffer);
+            buffer.append("\r\n");
+            dos.writeBytes(buffer.toString());
+
+            if (i == PositionInManifest.NAMED_SECTION) {
+                buffer = new StringBuffer(name);
+                buffer.append(": ");
+                buffer.append(value);
+                make72Safe(buffer);
+                buffer.append("\r\n");
+                dos.writeBytes(buffer.toString());
+            }
+
+            dos.writeBytes("\r\n");
+        }
+
+        dos.flush();
+        byte[] mfBytes = out.toByteArray();
+        System.out.println("-------------------------------------------"
+                + "-----------------------------");
+        System.out.print(new String(mfBytes, UTF_8));
+        System.out.println("-------------------------------------------"
+                + "-----------------------------");
+        return mfBytes;
+    }
+
+    /**
+     * Adds line breaks to enforce a maximum 72 bytes per line.
+     * From previous Manifest implementation without respect for utf encoded
+     * character boundaries breaking also within multi-byte characters.
+     */
+    static void make72Safe(StringBuffer line) {
+        int length = line.length();
+        int index = 72;
+        while (index < length) {
+            line.insert(index, "\r\n ");
+            index += 74; // + line width + line break ("\r\n")
+            length += 3; // + line break ("\r\n") and space
+        }
+        return;
+    }
+
+}
--- old/test/jdk/java/util/jar/Manifest/LineBreakLineWidth.java	2018-05-04 08:08:08.250296138 +0200
+++ new/test/jdk/java/util/jar/Manifest/LineBreakLineWidth.java	2018-05-04 08:08:08.018294088 +0200
@@ -26,9 +26,13 @@
 import java.io.ByteArrayInputStream;
 import java.io.ByteArrayOutputStream;
 import java.io.IOException;
+import java.util.Set;
+import java.util.TreeSet;
 import java.util.jar.Manifest;
 import java.util.jar.Attributes;
 import java.util.jar.Attributes.Name;
+import java.util.stream.Collectors;
+import java.util.stream.IntStream;
 
 import org.testng.annotations.Test;
 import static org.testng.Assert.*;
@@ -37,12 +41,19 @@
  * @test
  * @bug 6372077
  * @run testng LineBreakLineWidth
- * @summary write valid manifests with respect to line breaks
- *          and read any line width
+ * @summary Write valid manifests with respect to line widths breaking longer
+ *          lines, especially for 6372077 that the lines are sufficiently wide
+ *          to contain all possible valid header names, and read any line
+ *          width.
  */
 public class LineBreakLineWidth {
 
     /**
+     * maximum line width not including the line break
+     */
+    final static int MAX_LINE_CONTENT_LENGTH = 72;
+
+    /**
      * maximum header name length from {@link Name#isValid(String)}
      * not including the name-value delimiter <code>": "</code>
      */
@@ -60,20 +71,27 @@
     final static int TEST_WIDTH_RANGE = 123;
 
     /**
-     * tests if only valid manifest files can written depending on the header
+     * Tests if only valid manifest files can written depending on the header
      * name length or that an exception occurs already on the attempt to write
      * an invalid one otherwise and that no invalid manifest can be written.
      * <p>
-     * due to bug JDK-6372077 it was possible to write a manifest that could
-     * not be read again. independent of the actual manifest line width, such
+     * Due to bug JDK-6372077 it was possible to write a manifest that could
+     * not be read again. Independent of the actual manifest line width, such
      * a situation should never happen, which is the subject of this test.
+     * <p>
+     * While this test covers header names which can contain only one-byte utf
+     * encoded characters and the resulting manifest line widths, multi-byte
+     * utf encoded characters can occur in header values, which is subject of
+     * {@link #testManifestLineWidthMaximumRange()}.
      */
     @Test
     public void testWriteValidManifestOrException() throws IOException {
+        assertTrue(TEST_WIDTH_RANGE > MAX_HEADER_NAME_LENGTH);
+
         /*
-         * multi-byte utf-8 characters cannot occur in header names,
+         * Multi-byte utf-8 characters cannot occur in header names,
          * only in values which are not subject of this test here.
-         * hence, each character in a header name uses exactly one byte and
+         * Hence, each character in a header name uses exactly one byte and
          * variable length utf-8 character encoding doesn't affect this test.
          */
 
@@ -111,18 +129,18 @@
     }
 
     /**
-     * tests that manifest files can be read even if the line breaks are
+     * Tests that manifest files can be read even if the line breaks are
      * placed in different positions than where the current JDK's
      * {@link Manifest#write(java.io.OutputStream)} would have put it provided
      * the manifest is valid otherwise.
      * <p>
-     * the <a href="{@docRoot}/../specs/jar/jar.html#Notes_on_Manifest_and_Signature_Files">
-     * "Notes on Manifest and Signature Files" in the "JAR File
-     * Specification"</a> state that "no line may be longer than 72 bytes
-     * (not characters), in its utf8-encoded form." but allows for earlier or
+     * The <a href="{@docRoot}/../specs/jar/jar.html#Notes_on_Manifest_and_Signature_Files">
+     * "Notes on Manifest and Signature Files"</a> in the "JAR File
+     * Specification" state that <q>no line may be longer than 72 bytes (not
+     * characters), in its utf8-encoded form.</q> but allows for earlier or
      * additional line breaks.
      * <p>
-     * the most important purpose of this test case is probably to make sure
+     * The most important purpose of this test case is probably to make sure
      * that manifest files broken at 70 bytes line width written with the
      * previous version of {@link Manifest} before this fix still work well.
      */
@@ -186,7 +204,8 @@
          * form.
          */
         String lineBrokenSectionName = breakLines(lineWidth, "Name: " + name);
-        String lineBrokenNameAndValue = breakLines(lineWidth, name + ": " + value);
+        String lineBrokenNameAndValue = breakLines(lineWidth, name + ": "
+                    + value);
 
         ByteArrayOutputStream mfBuf = new ByteArrayOutputStream();
         mfBuf.write("Manifest-Version: 1.0".getBytes(UTF_8));
@@ -264,7 +283,7 @@
         return mfBytes;
     }
 
-    private static void printManifest(byte[] mfBytes) {
+    static void printManifest(byte[] mfBytes) {
         final String sepLine = "----------------------------------------------"
                 + "---------------------|-|-|"; // |-positions: ---68-70-72
         System.out.println(sepLine);
@@ -272,13 +291,133 @@
         System.out.println(sepLine);
     }
 
-    private static void assertMainAndSectionValues(Manifest mf, String name,
+    static void assertMainAndSectionValues(Manifest mf, String name,
             String value) {
         String mainValue = mf.getMainAttributes().getValue(name);
-        String sectionValue = mf.getAttributes(name).getValue(name);
-
         assertEquals(value, mainValue, "value different in main section");
+        Attributes attributes = mf.getAttributes(name);
+        assertNotNull(attributes, "named section not found");
+        String sectionValue = attributes.getValue(name);
         assertEquals(value, sectionValue, "value different in named section");
     }
 
+    /**
+     * @see LineBreakCharacter#numByteUtfCharacter(int, int)
+     */
+    static String numByteUtfCharacter(int numBytes, int seed) {
+        seed = seed < 0 ? -seed : seed;
+        if (numBytes == 1) {
+            seed %= 0x5F;
+            seed += 0x20; // exclude control characters (0..0x19) here
+        } else if (numBytes == 2) {
+            seed %= 0x800 - 0x80;
+            seed += 0x80;
+        } else if (numBytes == 3) {
+            seed %= 0x10000 - 0x800 + 0xFDD0 - 0xFDEF + 0xFFFE - 0xFFFF;
+            seed += 0x800
+                    + (seed >= 0xFDD0 ? 0xFDEF - 0xFDD0 : 0) // non-characters
+                    + (seed % 0x10000) * (0xFFFF - 0xFFFE); // byte order marks
+        } else {
+            seed %= 0x110000 - (0x10000 - 0xFFFE);
+            seed += 0x10000
+                    + (seed % 0x10000) * (0xFFFF - 0xFFFE); // byte order marks
+        }
+
+        String string = new String(Character.toChars(seed));
+        assertEquals(string.getBytes(UTF_8).length, numBytes,
+                "self-test failed: unexpected utf encoded character length");
+        return string;
+    }
+
+    /**
+     * Tests that the manifest line content width is between <em>69 (1 +
+     * maximum line width (without line break) 72 - maximum utf encoded
+     * character length 4)</em> and <em>72</em> before continued on the next
+     * line, aka continuation line, depending only on the last character
+     * before the line break and not on those before or after in order to
+     * ensure the minimum number of line breaks and continuation lines
+     * possible.
+     * <p>
+     * The <a href="{@docRoot}/../specs/jar/jar.html#Notes_on_Manifest_and_Signature_Files">
+     * "Notes on Manifest and Signature Files"</a> in the "JAR File
+     * Specification" state that <q>no line may be longer than 72 bytes (not
+     * characters), in its utf8-encoded form.</q> and allows for earlier or
+     * additional line breaks.
+     * <p>
+     * The number of characters a line may be less than the number of bytes
+     * but this is not explicitly verified because it can be assumed if some
+     * characters use more than one byte each and the number of bytes is as
+     * expected up to 72 that fewer than 72 characters were placed on the same
+     * line.
+     * <p>
+     * While {@link #testWriteValidManifestOrException()} tests header names
+     * lengths, this test covers the values.
+     */
+    @Test
+    public void testManifestLineWidthRange() throws Exception {
+        final int MAX_UTF_CHARACTER_ENCODED_LENGTH = 4;
+        final String TEST_NAME = "test";
+
+        // value will be filled with many possible sequences of utf
+        // characters of different encoded numbers of bytes in the form of
+        // "k times i-bytes size character" + "k times i-bytes size character"
+        // and by prepending up to 72 "x" characters at every possible position
+        // relative to a possible line break.
+        String value = "";
+        int seed = 0;
+        for (int i = 1; i <= MAX_UTF_CHARACTER_ENCODED_LENGTH; i++) {
+            for (int j = 1; j <= MAX_UTF_CHARACTER_ENCODED_LENGTH; j++) {
+                String valueI = "", valueJ = "";
+                for (int k = 1; k < MAX_LINE_CONTENT_LENGTH; k++) {
+                    valueI += numByteUtfCharacter(i, seed++);
+                    valueJ += numByteUtfCharacter(j, seed++);
+                    value += valueI + valueJ;
+                }
+            }
+        }
+
+        String offset = "";
+        for (int i = 0; i <= MAX_LINE_CONTENT_LENGTH; i++) {
+            offset += "x";
+            byte[] mfBytes = writeManifest(TEST_NAME, offset + value);
+            Manifest mf = new Manifest(new ByteArrayInputStream(mfBytes));
+            assertMainAndSectionValues(mf, TEST_NAME, offset + value);
+
+            Set<Integer> allowedWidths = IntStream.rangeClosed(
+                    MAX_LINE_CONTENT_LENGTH - MAX_UTF_CHARACTER_ENCODED_LENGTH + 1,
+                    MAX_LINE_CONTENT_LENGTH).boxed().collect(Collectors.toSet());
+            assertEquals(allowedWidths, findContinuedLineWidths(mfBytes));
+        }
+    }
+
+    /**
+     * Counts the length of lines (without the line breaks) which are
+     * continued, not the continuation lines but those before that did not fit
+     * on one line, whereby continuation lines may be continued again on
+     * another line, and returns their length in bytes, not characters.
+     */
+    static Set<Integer> findContinuedLineWidths(byte[] mfBytes) {
+        Set<Integer> widths = new TreeSet<>();
+        int previousLineWidth = -1;
+        int b = 0; // start position of current line
+        for (int i = 0; i < mfBytes.length; i++) {
+            switch (mfBytes[i]) {
+            case '\r':
+            case '\n':
+                previousLineWidth = i - b;
+                if (mfBytes[i] == '\r' &&
+                        i + 1 < mfBytes.length && mfBytes[i + 1] == '\n') {
+                    i++;
+                }
+                b = i + 1;
+                break;
+            case ' ':
+                if (i == b) {
+                    widths.add(previousLineWidth);
+                }
+            }
+        }
+        return widths;
+    }
+
 }
--- old/test/jdk/sun/security/tools/jarsigner/LineBrokenMultiByteCharacter.java	2018-05-04 08:08:08.882301721 +0200
+++ new/test/jdk/sun/security/tools/jarsigner/LineBrokenMultiByteCharacter.java	2018-05-04 08:08:08.658299741 +0200
@@ -21,35 +21,48 @@
  * questions.
  */
 
-/*
- * @test
- * @bug 6695402
- * @summary verify signatures of jars containing classes with names
- *          with multi-byte unicode characters broken across lines
- * @library /test/lib
- */
-
 import java.io.IOException;
 import java.io.InputStream;
 import java.nio.file.Files;
 import java.nio.file.Paths;
-import java.util.jar.JarFile;
+import java.util.HashMap;
 import java.util.jar.Attributes.Name;
+import java.util.jar.JarFile;
 import java.util.jar.JarEntry;
+import java.util.zip.ZipEntry;
+import java.util.zip.ZipFile;
 
 import static java.nio.charset.StandardCharsets.UTF_8;
 
 import jdk.test.lib.SecurityTools;
 import jdk.test.lib.util.JarUtils;
 
+import org.testng.annotations.*;
+import static org.testng.Assert.*;
+
+/**
+ * @test
+ * @bug 6695402
+ * @library /test/lib
+ * @run testng LineBrokenMultiByteCharacter
+ * @summary verify signatures of jars containing classes with names
+ *          with multi-byte unicode characters broken across lines
+ *
+ * Before utf characters were kept in one piece by bug 6443578 and partially
+ * broken across a line break in manifests, bug 6695402 could occur resulting
+ * in a signature considered invalid. That can still happen after bug 6443578
+ * fixed when signing manifests that already contain entries with characters
+ * broken across a line break, presumably when an older manifest is present
+ * created with a different jdk.
+ */
 public class LineBrokenMultiByteCharacter {
 
     /**
-     * this name will break across lines in MANIFEST.MF at the
+     * This name will break across lines in MANIFEST.MF at the
      * middle of a two-byte utf-8 encoded character due to its e acute letter
      * at its exact position.
-     *
-     * because no file with such a name exists {@link JarUtils} will add the
+     * <p>
+     * Because no file with such a name exists {@link JarUtils} will add the
      * name itself as contents to the jar entry which would have contained a
      * compiled class in the original bug. For this test, the contents of the
      * files contained in the jar file is not important as long as they get
@@ -58,55 +71,157 @@
      * @see #verifyClassNameLineBroken(JarFile, String)
      */
     static final String testClassName =
-            "LineBrokenMultiByteCharacterA1234567890B1234567890C123456789D1234\u00E9xyz.class";
+            "LineBrokenMultiByteCharacterA1234567890B1234567890C123456789"
+            + "D1234\u00E9xyz.class";
 
+    /**
+     * Used for a test case where an entry / class file is added to an existing
+     * signed jar that already contains an entry with this name which of course
+     * has to be distinct from the one tested.
+     */
     static final String anotherName =
-            "LineBrokenMultiByteCharacterA1234567890B1234567890C123456789D1234567890.class";
+            "LineBrokenMultiByteCharacterA1234567890B1234567890C123456789"
+            + "D1234567890.class";
 
     static final String alias = "a";
     static final String keystoreFileName = "test.jks";
-    static final String manifestFileName = "MANIFEST.MF";
+    static final String manifestFileName = JarFile.MANIFEST_NAME;
 
-    public static void main(String[] args) throws Exception {
-        prepare();
+    byte[] manifestNoDigest = "Manifest-Version: 1.0\r\n\r\n".getBytes(UTF_8);
+    byte[] manifestWithDigest; // with character broken across line break
 
-        testSignJar("test.jar");
-        testSignJarNoManifest("test-no-manifest.jar");
-        testSignJarUpdate("test-update.jar", "test-updated.jar");
+    /**
+     * Simulate a jar manifest as it would have been created by an earlier jdk
+     * by re-arranging the line break at exactly 72 bytes content thereby
+     * breaking the multi-byte character under test.
+     */
+    static byte[] rebreak(byte[] mf0) {
+        byte[] mf1 = new byte[mf0.length];
+        int c0 = 0, c1 = 0; // bytes since last line start
+        for (int i0 = 0, i1 = 0; i0 < mf0.length; i0++, i1++) {
+            switch (mf0[i0]) {
+            case '\r':
+                if (i0 + 2 < mf0.length &&
+                        mf0[i0 + 1] == '\n' && mf0[i0 + 2] == ' ') {
+                    // skip line break
+                    i0 += 2;
+                    i1 -= 1;
+                } else {
+                    mf1[i1] = mf0[i0];
+                    c0 = c1 = 0;
+                }
+                break;
+            case '\n':
+                if (i0 + 1 < mf0.length && mf0[i0 + 1] == ' ') {
+                    // skip line break
+                    i0 += 1;
+                    i1 -= 1;
+                } else {
+                    mf1[i1] = mf0[i0];
+                    c0 = c1 = 0;
+                }
+                break;
+            case ' ':
+                if (c0 == 0) {
+                    continue;
+                }
+            default:
+                c0++;
+                if (c1 == 72) {
+                    mf1[i1++] = '\r';
+                    mf1[i1++] = '\n';
+                    mf1[i1++] = ' ';
+                    c1 = 1;
+                } else {
+                    c1++;
+                }
+                mf1[i1] = mf0[i0];
+            }
+        }
+        return mf1;
     }
 
-    static void prepare() throws Exception {
+    @BeforeClass
+    public void prepare() throws Exception {
         SecurityTools.keytool("-keystore", keystoreFileName, "-genkeypair",
                 "-storepass", "changeit", "-keypass", "changeit", "-storetype",
                 "JKS", "-alias", alias, "-dname", "CN=X", "-validity", "366")
             .shouldHaveExitValue(0);
 
-        Files.write(Paths.get(manifestFileName), (Name.
-                MANIFEST_VERSION.toString() + ": 1.0\r\n").getBytes(UTF_8));
+        String jarFileName = "reference-manifest.jar";
+        createJarWithManifest(jarFileName, manifestNoDigest,
+                testClassName, anotherName);
+        SecurityTools.jarsigner("-keystore", keystoreFileName, "-storetype",
+                "JKS", "-storepass", "changeit", "-debug", jarFileName, alias)
+            .shouldHaveExitValue(0);
+        try (ZipFile refJar = new ZipFile(jarFileName);) {
+            ZipEntry mfEntry = refJar.getEntry(JarFile.MANIFEST_NAME);
+            manifestWithDigest = refJar.getInputStream(mfEntry).readAllBytes();
+        }
+        System.out.println("manifestWithDigest before re-break = " +
+                new String(manifestWithDigest, UTF_8));
+        manifestWithDigest = rebreak(manifestWithDigest);
+        System.out.println("manifestWithDigest after re-break = " +
+                new String(manifestWithDigest, UTF_8));
+        // now, manifestWithDigest is accepted as unmodified by
+        // jdk.security.jarsigner.JarSigner#updateDigests
+        // (ZipEntry,ZipFile,MessageDigest[],Manifest) on line 985:
+        // "if (!mfDigest.equalsIgnoreCase(base64Digests[i])) {"
+        // and therefore left unchanged when the jar is signed and
+        // signature verification will check it, which is the test case.
+
+        Files.createDirectory(Paths.get("META-INF/"));
+    }
+
+    @Test
+    public void testSignJarWithNoExistingClassEntry() throws Exception {
+        String jarFileName = "test-eacuteinonepiece.jar";
+        createJarWithManifest(jarFileName, manifestNoDigest, testClassName);
+        signAndVerifyJarSignature(jarFileName, false);
     }
 
-    static void testSignJar(String jarFileName) throws Exception {
-        JarUtils.createJar(jarFileName, manifestFileName, testClassName);
-        verifyJarSignature(jarFileName);
+    @Test
+    public void testSignJarWithBrokenEAcuteClassEntry() throws Exception {
+        String jarFileName = "test-brokeneacute.jar";
+        createJarWithManifest(jarFileName, manifestWithDigest, testClassName);
+        signAndVerifyJarSignature(jarFileName, true);
     }
 
-    static void testSignJarNoManifest(String jarFileName) throws Exception {
+    @Test
+    public void testSignJarNoManifest() throws Exception {
+        String jarFileName = "test-no-manifest.jar";
         JarUtils.createJar(jarFileName, testClassName);
-        verifyJarSignature(jarFileName);
+        signAndVerifyJarSignature(jarFileName, false);
     }
 
-    static void testSignJarUpdate(
-            String initialFileName, String updatedFileName) throws Exception {
-        JarUtils.createJar(initialFileName, manifestFileName, anotherName);
+    @Test
+    public void testSignJarUpdateWithEAcuteClassEntryInOnePiece()
+            throws Exception {
+        String initialFileName = "test-eacuteinonepiece-update.jar";
+        String updatedFileName = "test-eacuteinonepiece-updated.jar";
+        createJarWithManifest(initialFileName, manifestNoDigest, anotherName);
         SecurityTools.jarsigner("-keystore", keystoreFileName, "-storetype",
                 "JKS", "-storepass", "changeit", "-debug", initialFileName,
                 alias).shouldHaveExitValue(0);
         JarUtils.updateJar(initialFileName, updatedFileName, testClassName);
-        verifyJarSignature(updatedFileName);
+        signAndVerifyJarSignature(updatedFileName, false);
     }
 
-    static void verifyJarSignature(String jarFileName) throws Exception {
-        // actually sign the jar
+    @Test
+    public void testSignJarUpdateWithBrokenEAcuteClassEntry()
+            throws Exception {
+        String initialFileName = "test-brokeneacute-update.jar";
+        String updatedFileName = "test-brokeneacute-updated.jar";
+        createJarWithManifest(initialFileName, manifestWithDigest, anotherName);
+        SecurityTools.jarsigner("-keystore", keystoreFileName, "-storetype",
+                "JKS", "-storepass", "changeit", "-debug", initialFileName,
+                alias).shouldHaveExitValue(0);
+        JarUtils.updateJar(initialFileName, updatedFileName, testClassName);
+        signAndVerifyJarSignature(updatedFileName, true);
+    }
+
+    void signAndVerifyJarSignature(String jarFileName,
+            boolean expectBrokenEAcute) throws Exception {
         SecurityTools.jarsigner("-keystore", keystoreFileName, "-storetype",
                 "JKS", "-storepass", "changeit", "-debug", jarFileName, alias)
             .shouldHaveExitValue(0);
@@ -114,34 +229,34 @@
         try (
             JarFile jar = new JarFile(jarFileName);
         ) {
-            verifyClassNameLineBroken(jar, testClassName);
+            verifyClassNameLineBroken(jar, testClassName, expectBrokenEAcute);
             verifyCodeSigners(jar, jar.getJarEntry(testClassName));
         }
     }
 
     /**
-     * it would be too easy to miss the actual test case by just renaming an
+     * It would be too easy to miss the actual test case by just renaming an
      * identifier so that the multi-byte encoded character would not any longer
      * be broken across a line break.
      *
-     * this check here verifies that the actual test case is tested based on
+     * This check here verifies that the actual test case is tested based on
      * the manifest and not based on the signature file because at the moment,
      * the signature file does not even contain the desired entry at all.
-     *
-     * this relies on {@link java.util.jar.Manifest} breaking lines unaware
-     * of bytes that belong to the same multi-byte utf characters.
      */
-    static void verifyClassNameLineBroken(JarFile jar, String className)
-            throws IOException {
+    static void verifyClassNameLineBroken(JarFile jar, String className,
+            boolean expectBrokenEAcute) throws IOException {
         byte[] eAcute = "\u00E9".getBytes(UTF_8);
         byte[] eAcuteBroken =
                 new byte[] {eAcute[0], '\r', '\n', ' ', eAcute[1]};
 
-        if (jar.getManifest().getAttributes(className) == null) {
-            throw new AssertionError(className + " not found in manifest");
-        }
+        assertNotNull(jar.getManifest().getAttributes(className),
+                className + " not found in manifest");
 
         JarEntry manifestEntry = jar.getJarEntry(JarFile.MANIFEST_NAME);
+        System.out.println("expectBrokenEAcute = " + expectBrokenEAcute);
+        System.out.println("Manifest = \n" + new String(
+                jar.getInputStream(manifestEntry).readAllBytes(), UTF_8));
+
         try (
             InputStream manifestIs = jar.getInputStream(manifestEntry);
         ) {
@@ -156,10 +271,12 @@
                     bytesMatched = 0;
                 }
             }
-            if (bytesMatched < eAcuteBroken.length) {
-                throw new AssertionError("self-test failed: multi-byte "
-                        + "utf-8 character not broken across lines");
-            }
+            assertEquals(expectBrokenEAcute,
+                    bytesMatched == eAcuteBroken.length,
+                    "self-test failed: multi-byte "
+                    + "utf-8 character "
+                    + (expectBrokenEAcute ? "not " : "")
+                    + "broken across lines");
         }
     }
 
@@ -175,11 +292,16 @@
         // a check for the presence of code signers is sufficient to check
         // bug JDK-6695402. no need to also verify the actual code signers
         // attributes here.
-        if (jarEntry.getCodeSigners() == null
-                || jarEntry.getCodeSigners().length == 0) {
-            throw new AssertionError(
-                    "no signing certificate found for " + jarEntry.getName());
-        }
+        assertTrue(jarEntry.getCodeSigners() != null
+                && jarEntry.getCodeSigners().length > 0,
+                "no signing certificate found for " + jarEntry.getName());
+    }
+
+    static void createJarWithManifest(String jarFileName, byte[] manifest,
+            String... files) throws IOException {
+        JarUtils.createJar("yetwithoutmanifest-" + jarFileName, files);
+        JarUtils.updateJar("yetwithoutmanifest-" + jarFileName, jarFileName,
+                new HashMap<>() {{ put(manifestFileName, manifest); }});
     }
 
 }

Re: RFR for 6443578: Continuation lines in JAR manifests do not follow RFC-822

Reply via email to