Hi Gerd

I've had a go at changing SrtDisplay so that it reproduces output close to
the mkgmap resources/sort/cp*.txt files.

Patch attached.

I've also attached the output from the Turkish 00000848.SRT you sent; there
are quite a few differences from ours.

We should consider what to do with the version numbering (id2, I presume).

There is something in the expansion flags that appears to control which
secondary/tertiary variant should be selected, and I haven't bothered with
this.

Ticker

On Fri, 2021-12-10 at 13:37 +0000, Gerd Petermann wrote:
> Hi Ticker,
>
> attached is the extracted *.srt.
> The original link to the Turkey download posted here no longer works:
> https://www.mkgmap.org.uk/pipermail/mkgmap-dev/2017q2/026715.html
>
> Gerd
>
> ________________________________________
> From: mkgmap-dev <mkgmap-dev-boun...@lists.mkgmap.org.uk> on behalf of
> Ticker Berkin <rwb-mkg...@jagit.co.uk>
> Sent: Friday, 10 December 2021 10:37
> To: Development list for mkgmap
> Subject: Re: [mkgmap-dev] Error in MdrCheck?
>
> Hi Gerd
>
> Working on the basis of guessing and resources/sort/README, we shouldn't
> use the same id2 if our sort is different to one from Garmin (or
> elsewhere). A device will have a base-map that defines the sort it needs,
> represented by id1/id2. Additional maps shouldn't use the same pair to
> represent a different sort. Maybe we should change id2 for all our maps
> to be some arbitrary higher number, or certainly do this when a conflict
> is spotted.
>
> Looking at the SrtDisplay "Summary of ordering" output, it should be
> possible to hack the code a bit or edit the output to get back to what
> our sort tables look like. Assuming the '?' problem can be fixed, the
> significant question is what the lowest sortOrder means. In our tables,
> everything before the first "<" gets zero and doesn't contribute to the
> ordering, along with anything not defined. SrtDisplay puts everything
> after the first "<".
>
> Can you send me the Turkish .SRT subfile and I'll have a look.
>
> Ticker
>
>
> On Fri, 2021-12-10 at 08:15 +0000, Gerd Petermann wrote:
> > Hi Ticker,
> >
> > Both have the same ids:
> > 00000044 | 000002 | 0e 00 | id1 14
> > 00000046 | 000004 | 01 00 | id2 1
> > 00000048 | 000006 | e6 04 | codepage 1254
> >
> > Regarding SrtDisplay:
> > Our file looks very different compared to the "Summary of ordering"
> > report. I don't understand most of the details, and I certainly don't
> > know which one is better.
> > I think the summary cannot be used as input for mkgmap because it
> > contains several '?' where characters couldn't be converted to Unicode
> > (same problem when I create a map with --codepage=1252 and use
> > SrtDisplay on that).
> >
> > Gerd
> >
> > ________________________________________
> > From: mkgmap-dev <mkgmap-dev-boun...@lists.mkgmap.org.uk> on behalf of
> > Ticker Berkin <rwb-mkg...@jagit.co.uk>
> > Sent: Thursday, 9 December 2021 12:51
> > To: Development list for mkgmap
> > Subject: Re: [mkgmap-dev] Error in MdrCheck?
> >
> > Hi Gerd
> >
> > The alternative would be to use test.display.SrtDisplay to generate a
> > different version of our resources/sort/cp1254.txt that matches theirs,
> > or maybe have versions findable by id1/id2 that match.
> >
> > Ticker
> >
> >
> > On Thu, 2021-12-09 at 09:09 +0000, Gerd Petermann wrote:
> > > Hi devs,
> > >
> > > I think there is a bug in MdrCheck, probably also in other Check
> > > programs. The program doesn't read the SRT file content from the map;
> > > instead it uses the corresponding data from mkgmap.
> > > If the built-in sort order in mkgmap doesn't match the SRT file
> > > content, the program will report errors about wrong order or missing
> > > repeat flags etc.
> > > I guess this explains why MdrCheck complains about the Garmin demo
> > > map for Turkey?
> > >
> > > I once started to implement a SrtFileReader but I don't know if that
> > > can be used instead.
> > >
> > > Gerd
> > >
> > >
> > > _______________________________________________
> > > mkgmap-dev mailing list
> > > mkgmap-dev@lists.mkgmap.org.uk
> > > https://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev
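As an aside on the id1/id2/codepage dump quoted above: the three values read as little-endian 16-bit fields. Here is a minimal sketch of that decoding, using only the six bytes shown in the dump. The class and method names are made up for illustration and are not part of SrtDisplay or the patch.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, not mkgmap code: decodes the bytes from the dump.
public class SrtSubHeaderSketch {

    /** Reads an unsigned 16-bit little-endian value at the given offset. */
    static int u16(byte[] data, int offset) {
        return ByteBuffer.wrap(data)
                .order(ByteOrder.LITTLE_ENDIAN)
                .getShort(offset) & 0xffff;
    }

    public static void main(String[] args) {
        // Bytes in the order shown in the dump: "0e 00", "01 00", "e6 04".
        byte[] bytes = {0x0e, 0x00, 0x01, 0x00, (byte) 0xe6, 0x04};

        System.out.println("id1 " + u16(bytes, 0));      // 0x000e
        System.out.println("id2 " + u16(bytes, 2));      // 0x0001
        System.out.println("codepage " + u16(bytes, 4)); // 0x04e6
    }
}
```

With the bytes above this gives 14, 1 and 1254, matching the values in the dump.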
# Compare this with resource/sort/cp1254.txt.

codepage 1254
id1 14
id2 1
description "Turkish Sort"

characters

=0008=0009=000a=000b=000c=000d=000e=000f=0010=0011=0012=0013=0014=0015=0016=0017=0018=0019=001a=001b=001c=001d=007f,0001,0002,0003,0004,0005,0006,0007
 < 0020,00a0,001e,001f
 < `
 < ´
 < ˜
 < ^
 < ¯
 < ¨
 < ¸
 < _
 < 00ad
 < -
 < –
 < —
 < 002c
 < 003b
 < :
 < !
 < ¡
 < ?
 < ¿
 < .
 < ·
 < '
 < ‘
 < ’
 < ‚
 < ‹
 < ›
 < "
 < “
 < ”
 < „
 < «
 < »
 < (
 < )
 < [
 < ]
 < {
 < }
 < §
 < ¶
 < ©
 < ®
 < @
 < *
 < /
 < \
 < &
 < 0023
 < %
 < ‰
 < †
 < ‡
 < •
 < ˆ
 < °
 < +
 < ±
 < ÷
 < ×
 < 003c
 < 003d
 < >
 < ¬
 < |
 < ¦
 < ~
 < ¤
 < ¢
 < $
 < £
 < ¥
 < €
 < 0
 < 1,¹
 < 2,²
 < 3,³
 < 4
 < 5
 < 6
 < 7
 < 8
 < 9
 < a,A,ª ; á,Á ; à,À ; â,Â ; å,Å ; ä,Ä ; ã,Ã
 < b,B
 < c,C ; ç,Ç
 < d,D
 < e,E ; é,É ; è,È ; ê,Ê ; ë,Ë
 < f,F
 < ƒ
 < g,G ; ğ,Ğ
 < h,H
 < i,I ; í,Í ; ì,Ì ; î,Î ; ï,Ï
 < ı ; İ
 < j,J
 < k,K
 < l,L
 < m,M
 < n,N ; Ñ=ñ
 < o,O,º ; ó,Ó ; ò,Ò ; ô,Ô ; ö,Ö ; õ,Õ ; ø,Ø
 < p,P
 < q,Q
 < r,R
 < s,S ; š,Š ; ş,Ş
 < t,T
 < u,U ; ú,Ú ; ù,Ù ; û,Û ; ü,Ü
 < v,V
 < w,W
 < x,X
 < y,Y ; ÿ,Ÿ
 < z,Z
 < µ

expand … to . . .
expand Œ to O E
expand ™ to T M
expand œ to O E
expand ¼ to ¹ / 4
expand ½ to ¹ / ²
expand ¾ to ³ / 4
expand Æ to A E
expand ß to S S
expand æ to A E

# to here
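To make the separators in the listing above easier to read, here is a minimal standalone sketch of how they follow from the three sort levels: a change in the primary starts a new " < " line, a change in the secondary gives " ; ", a change in the tertiary gives ",", and identical positions are joined with "=". It mirrors the loop in the patch but is not the patch itself; the Pos class and the sample values are invented for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the summary loop; not mkgmap code.
public class OrderingSummarySketch {

    /** Simplified position: a printable label plus the three sort levels. */
    static final class Pos {
        final String label;
        final int first, second, third;

        Pos(String label, int first, int second, int third) {
            this.label = label;
            this.first = first;
            this.second = second;
            this.third = third;
        }
    }

    static String summarize(List<Pos> positions) {
        StringBuilder out = new StringBuilder();
        // Start at primary 0 so ignored characters land before the first "<".
        int lastFirst = 0, lastSecond = 0, lastThird = 0;
        for (Pos p : positions) {
            if (p.first != lastFirst)
                out.append("\n < ");   // new primary: new line
            else if (p.second != lastSecond)
                out.append(" ; ");     // same primary, new secondary
            else if (p.third != lastThird)
                out.append(",");       // same secondary, new tertiary
            else
                out.append("=");       // identical position
            out.append(p.label);
            lastFirst = p.first;
            lastSecond = p.second;
            lastThird = p.third;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Invented sample values, chosen only to exercise each separator.
        List<Pos> sample = Arrays.asList(
                new Pos("0008", 0, 0, 0),
                new Pos("a", 1, 1, 1),
                new Pos("A", 1, 1, 2),
                new Pos("\u00e1", 1, 2, 1),
                new Pos("b", 2, 1, 1));
        System.out.println(summarize(sample));
    }
}
```

Starting the comparison at primary 0 is what puts the ignored characters on the line before the first "<", as in our own tables.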
Index: src/test/display/SrtDisplay.java
===================================================================
--- src/test/display/SrtDisplay.java	(revision 571)
+++ src/test/display/SrtDisplay.java	(working copy)
@@ -57,6 +57,11 @@
 
 	private final Map<Integer, Integer> offsetToBlock = new HashMap<>();
 
+	private String srtDescription;
+	private int codepage;
+	private int id1;
+	private int id2;
+
 	protected void print() {
 		readCommonHeader();
 		readFileHeader();
@@ -119,9 +124,9 @@
 
 		d.setTitle("Description");
 
-		String s = d.zstringValue("Description: %s");
+		srtDescription = d.zstringValue("Description: %s");
 
-		long remain = description.getLen() - s.length() - 1;
+		long remain = description.getLen() - srtDescription.length() - 1;
 		d.rawValue((int) remain);
 		d.print(outStream);
 
@@ -138,10 +143,10 @@
 		d.setSectStart(start);
 		reader.position(start);
 		int len = d.charValue("sub header len %d");
-		d.charValue("id1 %d");
-		d.charValue("id2 %d");
+		id1 = d.charValue("id1 %d");
+		id2 = d.charValue("id2 %d");
 
-		int codepage = d.charValue("codepage %d");
+		codepage = d.charValue("codepage %d");
 		if (codepage == 65001)
 			isUnicode = true;
 		Charset charset = Sort.charsetFromCodepage(codepage);
@@ -206,38 +211,79 @@
 		d.setTitle("------- Summary of ordering --------");
 
 		Formatter chars = new Formatter();
-		Formatter comment = new Formatter();
+		//Formatter comment = new Formatter();
+
+		// reproduce header like mkgmap resource/sort/cp*.txt entries
+		chars.format("\n\n\n");
+		chars.format("# Compare this with resource/sort/cp%d.txt.\n\n", codepage);
+		chars.format("codepage %d\n", codepage);
+		chars.format("id1 %d\n", id1);
+		chars.format("id2 %d\n", id2);
+		chars.format("description \"%s\"\n\n", srtDescription);
+		chars.format("characters\n\n");
+
 		CharPosition last = new CharPosition(0);
-		last.first = -1;
+		//last.first = -1;
+		last.first = 0; // start first line with zero/ignore sortOrder
 		for (CharPosition cp : charmap) {
-			if (cp.expands)
+			if (cp.expands > 0)
 				continue;
+			int unicodeChar = toUnicode(cp.val);
+			if (unicodeChar < 0) // no character defined for this position
+				continue;
 			if (cp.first != last.first) {
 				//chars.format(" # %s\n[%d] < ", comment, cp.first);
-				chars.format("\n< ");
-				comment = new Formatter();
+				chars.format("\n < ");
+				//comment = new Formatter();
 			} else if (cp.second != last.second) {
 				chars.format(" ; ");
-				comment.format(" ; ");
+				//comment.format(" ; ");
 			} else if (cp.third != last.third) {
 				chars.format(",");
-				comment.format(",");
+				//comment.format(",");
 			} else {
 				chars.format("=");
-				comment.format("=");
+				//comment.format("=");
 			}
 			last = cp;
-			chars.format("%s", fmtChar(toUnicode(cp.val)));
-			comment.format("U+%04x", cp.val);
+			chars.format("%s", fmtChar(unicodeChar));
+			//comment.format("U+%04x", cp.val);
 		}
 
 		chars.format("\n");
 		for (CharPosition cp : charmap) {
-			if (cp.expands)
-				continue;
-			chars.format("%4s %s\n", fmtChar(toUnicode(cp.val)), cp);
+			if (cp.expands > 0) {
+				chars.format("expand %s to ", fmtChar(toUnicode(cp.val)));
+				for (int i = 0; i <= cp.expands; ++i) {
+					CharPosition ch = expansions.get(cp.first + i - 1);
+					// need to search for best char with this first/primary
+					// don't know what the meaning is of vals in second/third (eg sec=8,tert=3)
+					// %%% but they do should have more effect on the the entry selected
+					int charValue = -1;
+					for (CharPosition scanCp : charmap) {
+						if (scanCp.expands > 0)
+							continue;
+						if (scanCp.first == ch.first) {
+							if (scanCp.second == 1 && scanCp.third == 2) { // try and get upper case %%%
+								charValue = scanCp.val;
+								break;
+							} else if (charValue < 0) {
+								charValue = scanCp.val;
+							}
+						}
+					}
+					if (charValue >= 0)
+						charValue = toUnicode(charValue);
+					if (charValue >= 0)
+						chars.format(" %c", charValue);
+				}
+				chars.format("\n");
+			}
+			// else
+			//chars.format("%4s %s\n", fmtChar(toUnicode(cp.val)), cp);
 		}
+		chars.format("\n# to here\n", codepage);
 
 		d.item().addText(chars.toString());
 		d.print(outStream);
 
@@ -286,7 +332,11 @@
 		StringBuilder sb = new StringBuilder();
 		Formatter fmt = new Formatter(sb);
 		fmt.format("0x%02x ", charValue);
-		fmt.format("(%c) ", toUnicode(charValue));
+		int unicodeChar = toUnicode(charValue);
+		if (unicodeChar < 0) // no character defined for this position
+			fmt.format("NaC ");
+		else
+			fmt.format("(%c) ", unicodeChar);
 		if ((flags & 0x1) != 0)
 			sb.append("Letter ");
 		if ((flags & 0x2) != 0)
@@ -297,8 +347,8 @@
 		} else {
 			// This is an expansion, it sorts as two or more characters (eg ß sorts near ss).
 			// The pos is an index into srt5.
-			c.expands = true;
-			expansion(sb, c.first, (flags >> 4) & 0xf);
+			c.expands = (flags >> 4) & 0xf;
+			expansion(sb, c.first, c.expands);
 		}
 
 		item.addText(sb.toString());
@@ -373,7 +423,7 @@
 			CharBuffer chars = decoder.decode(b);
 			return chars.charAt(0);
 		} catch (CharacterCodingException e) {
-			return '?';
+			return -1;
 		}
 	}
 
@@ -472,7 +522,7 @@
 		private int first;
 		private int second;
 		private int third;
-		private boolean expands;
+		private int expands;
 
 		public CharPosition(int charValue) {
 			this.val = charValue;
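For anyone reading the patch without the full source to hand, the change from returning '?' to returning -1 can be tried in isolation with the sketch below. It is only an illustration: the decoder setup is my assumption about what surrounds the lines in the diff, and US-ASCII stands in for the real codepage simply because every JVM is guaranteed to have it.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical standalone version of the toUnicode idea; not mkgmap code.
public class DecodeOrMinusOneSketch {

    /** Returns the decoded character, or -1 if the byte has no mapping. */
    static int toUnicode(CharsetDecoder decoder, int byteValue) {
        ByteBuffer b = ByteBuffer.wrap(new byte[] {(byte) byteValue});
        try {
            CharBuffer chars = decoder.decode(b);
            return chars.charAt(0);
        } catch (CharacterCodingException e) {
            return -1; // caller can skip the position instead of printing '?'
        }
    }

    public static void main(String[] args) {
        // Assumption: errors are reported rather than replaced, so that
        // undefined bytes raise CharacterCodingException.
        CharsetDecoder decoder = StandardCharsets.US_ASCII.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);

        System.out.println(toUnicode(decoder, 0x41)); // a defined byte
        System.out.println(toUnicode(decoder, 0xe9)); // not defined in US-ASCII
    }
}
```

Returning -1 lets the summary loop skip undefined positions, whereas a literal '?' would be indistinguishable from the real question mark in the table.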
_______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk https://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev