Hi Gerd

I've had a go at changing SrtDisplay so that it produces output close
to the mkgmap resources/sort/cp*.txt files.

Patch attached

I've also attached the output for the Turkish 00000848.SRT you sent;
there are quite a few differences from ours. We should consider
what to do with the version numbering (id2, I presume).

Something in the expansion flags appears to control which
secondary/tertiary variant should be selected. I haven't dealt with
this yet.
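
For reference, the selection the patch makes at the moment can be
sketched as below. This is only an illustration: Pos and pick are
hypothetical names, not mkgmap classes, and treating second == 1 and
third == 2 as "upper case" is a guess carried over from the patch, not
a confirmed meaning of the values in a Garmin file.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch, not mkgmap code: choose a character to print for one expansion slot. */
final class ExpansionPickSketch {
    /** Simplified stand-in for SrtDisplay's CharPosition. */
    static final class Pos {
        final int val, first, second, third;

        Pos(int val, int first, int second, int third) {
            this.val = val;
            this.first = first;
            this.second = second;
            this.third = third;
        }
    }

    /**
     * Returns the code-page value of the entry chosen for the given primary
     * position, or -1 if no entry has that primary.
     */
    static int pick(List<Pos> charmap, int primary) {
        int fallback = -1;
        for (Pos p : charmap) {
            if (p.first != primary)
                continue;
            // Assumption: secondary 1 / tertiary 2 is the plain upper-case form.
            if (p.second == 1 && p.third == 2)
                return p.val;
            // Otherwise remember the first entry that shares the primary.
            if (fallback < 0)
                fallback = p.val;
        }
        return fallback;
    }

    public static void main(String[] args) {
        List<Pos> charmap = Arrays.asList(
                new Pos('a', 10, 1, 1),
                new Pos('A', 10, 1, 2),
                new Pos('b', 11, 1, 1));
        System.out.println((char) pick(charmap, 10));
        System.out.println((char) pick(charmap, 11));
    }
}
```

A version that honoured the expansion flags would presumably compare
second/third against the values stored with the expansion instead of
the fixed 1 and 2.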

Ticker

On Fri, 2021-12-10 at 13:37 +0000, Gerd Petermann wrote:
> Hi Ticker,
> 
> Attached is the extracted *.srt.
> The original link to the Turkey download posted here no longer works:
> https://www.mkgmap.org.uk/pipermail/mkgmap-dev/2017q2/026715.html
> 
> 
> Gerd
> 
> ________________________________________
> From: mkgmap-dev <mkgmap-dev-boun...@lists.mkgmap.org.uk> on behalf
> of Ticker Berkin <rwb-mkg...@jagit.co.uk>
> Sent: Friday, 10 December 2021 10:37
> To: Development list for mkgmap
> Subject: Re: [mkgmap-dev] Error in MdrCheck?
> 
> Hi Gerd
> 
> Working on the basis of guesswork and resources/sort/README, we
> shouldn't use the same id2 if our sort differs from Garmin's (or
> anyone else's). A device will have a base map that defines the sort
> it needs, represented by id1/id2. Additional maps shouldn't use the
> same pair to represent a different sort. Maybe we should change id2
> for all our maps to some arbitrary higher number, or at least do this
> when a conflict is spotted.
> 
> Looking at the SrtDisplay "Summary of ordering" output, it should be
> possible to hack the code a bit, or edit the output, to get back to
> what our sort tables look like. Assuming the '?' problem can be
> fixed, the significant question is the meaning of the lowest
> sortOrder. In our tables, everything before the first "<" gets zero
> and doesn't contribute to the ordering, along with anything not
> defined. SrtDisplay puts everything after the first "<".
> 
> Can you send me the Turkish .SRT subfile so I can have a look?
> 
> Ticker
> 
> 
> On Fri, 2021-12-10 at 08:15 +0000, Gerd Petermann wrote:
> > Hi Ticker,
> > 
> > Both have the same ids:
> > 00000044 | 000002 | 0e 00                   | id1 14
> > 00000046 | 000004 | 01 00                   | id2 1
> > 00000048 | 000006 | e6 04                   | codepage 1254
> > 
> > Regarding SrtDisplay:
> > Our file looks very different from the "Summary of ordering"
> > report. I don't understand most of the details, and I certainly
> > don't know which one is better.
> > I think the summary cannot be used as input for mkgmap because it
> > contains several '?' where characters couldn't be converted to
> > Unicode (the same problem occurs when I create a map with
> > --codepage=1252 and use SrtDisplay on that).
> > 
> > Gerd
> > 
> > ________________________________________
> > From: mkgmap-dev <mkgmap-dev-boun...@lists.mkgmap.org.uk> on behalf
> > of Ticker Berkin <rwb-mkg...@jagit.co.uk>
> > Sent: Thursday, 9 December 2021 12:51
> > To: Development list for mkgmap
> > Subject: Re: [mkgmap-dev] Error in MdrCheck?
> > 
> > Hi Gerd
> > 
> > The alternative would be to use test.display.SrtDisplay to generate
> > a different version of our resources/sort/cp1254.txt that matches
> > theirs, or maybe to have versions, findable by id1/id2, that match.
> > 
> > Ticker
> > 
> > 
> > On Thu, 2021-12-09 at 09:09 +0000, Gerd Petermann wrote:
> > > Hi devs,
> > > 
> > > I think there is a bug in MdrCheck, and probably in the other
> > > Check programs too. The program doesn't read the SRT file content
> > > from the map; instead it uses the corresponding data from mkgmap.
> > > If the built-in sort order in mkgmap doesn't match the SRT file
> > > content, the program will report errors about wrong order,
> > > missing repeat flags, etc.
> > > I guess this explains why MdrCheck complains about the Garmin
> > > demo map for Turkey?
> > > 
> > > I once started to implement a SrtFileReader, but I don't know if
> > > that can be used instead.
> > > 
> > > Gerd
> > > 
> > > 
> > > _______________________________________________
> > > mkgmap-dev mailing list
> > > mkgmap-dev@lists.mkgmap.org.uk
> > > https://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

# Compare this with resource/sort/cp1254.txt.

codepage 1254
id1 14
id2 1
description "Turkish Sort"

characters

=0008=0009=000a=000b=000c=000d=000e=000f=0010=0011=0012=0013=0014=0015=0016=0017=0018=0019=001a=001b=001c=001d=007f,0001,0002,0003,0004,0005,0006,0007
 < 0020,00a0,001e,001f
 < `
 < ´
 < ˜
 < ^
 < ¯
 < ¨
 < ¸
 < _
 < 00ad
 < -
 < –
 < —
 < 002c
 < 003b
 < :
 < !
 < ¡
 < ?
 < ¿
 < .
 < ·
 < '
 < ‘
 < ’
 < ‚
 < ‹
 < ›
 < "
 < “
 < ”
 < „
 < «
 < »
 < (
 < )
 < [
 < ]
 < {
 < }
 < §
 < ¶
 < ©
 < ®
 < @
 < *
 < /
 < \
 < &
 < 0023
 < %
 < ‰
 < †
 < ‡
 < •
 < ˆ
 < °
 < +
 < ±
 < ÷
 < ×
 < 003c
 < 003d
 < >
 < ¬
 < |
 < ¦
 < ~
 < ¤
 < ¢
 < $
 < £
 < ¥
 < €
 < 0
 < 1,¹
 < 2,²
 < 3,³
 < 4
 < 5
 < 6
 < 7
 < 8
 < 9
 < a,A,ª ; á,Á ; à,À ; â, ; å,Å ; ä,Ä ; ã,Ã
 < b,B
 < c,C ; ç,Ç
 < d,D
 < e,E ; é,É ; è,È ; ê,Ê ; ë,Ë
 < f,F
 < ƒ
 < g,G ; ğ,Ğ
 < h,H
 < i,I ; í,Í ; ì,Ì ; î,Î ; ï,Ï
 < ı ; İ
 < j,J
 < k,K
 < l,L
 < m,M
 < n,N ; Ñ=ñ
 < o,O,º ; ó,Ó ; ò,Ò ; ô,Ô ; ö,Ö ; õ,Õ ; ø,Ø
 < p,P
 < q,Q
 < r,R
 < s,S ; š,Š ; ş,Ş
 < t,T
 < u,U ; ú,Ú ; ù,Ù ; û,Û ; ü,Ü
 < v,V
 < w,W
 < x,X
 < y,Y ; ÿ,Ÿ
 < z,Z
 < µ
expand … to  . . .
expand Œ to  O E
expand ™ to  T M
expand œ to  O E
expand ¼ to  ¹ / 4
expand ½ to  ¹ / ²
expand ¾ to  ³ / 4
expand Æ to  A E
expand ß to  S S
expand æ to  A E

# to here
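
The separators in the listing above are the ones the patch writes:
" < " when the primary position changes, " ; " when only the secondary
changes, "," when only the tertiary changes, and "=" when all three are
equal. A minimal sketch of reading such a listing back into positions
follows. SortLineSketch, Entry and parse are hypothetical names, not
mkgmap code. It assumes that the counters restart at 1 after each
reset (values in a real SRT file may differ), that everything before
the first "<" has primary 0, and that any run of four hex digits is a
code point.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch, not mkgmap code: read a "characters" listing into sort positions. */
final class SortLineSketch {
    /** One character with the primary/secondary/tertiary position assigned to it. */
    static final class Entry {
        final int codePoint, first, second, third;

        Entry(int codePoint, int first, int second, int third) {
            this.codePoint = codePoint;
            this.first = first;
            this.second = second;
            this.third = third;
        }
    }

    /** True if the four chars starting at index i are all hex digits. */
    private static boolean isHex4(String s, int i) {
        if (i + 4 > s.length())
            return false;
        for (int k = i; k < i + 4; k++) {
            if (Character.digit(s.charAt(k), 16) < 0)
                return false;
        }
        return true;
    }

    static List<Entry> parse(String text) {
        List<Entry> out = new ArrayList<>();
        int first = 0, second = 0, third = 0; // primary 0 = ignored for ordering
        int i = 0;
        while (i < text.length()) {
            char c = text.charAt(i);
            if (Character.isWhitespace(c)) { i++; continue; }
            switch (c) {
            case '<': first++; second = 1; third = 1; i++; continue; // new primary
            case ';': second++; third = 1; i++; continue;            // new secondary
            case ',': third++; i++; continue;                        // new tertiary
            case '=': i++; continue;                                 // same position
            default: break;
            }
            int cp;
            if (isHex4(text, i)) {
                // Four hex digits stand for a code point, e.g. 002c for ','.
                cp = Integer.parseInt(text.substring(i, i + 4), 16);
                i += 4;
            } else {
                cp = text.codePointAt(i);
                i += Character.charCount(cp);
            }
            out.add(new Entry(cp, first, second, third));
        }
        return out;
    }

    public static void main(String[] args) {
        for (Entry e : parse("0001=0002 < a,A < b,B")) {
            System.out.printf("U+%04x %d/%d/%d%n", e.codePoint, e.first, e.second, e.third);
        }
    }
}
```

Only the "characters" block should be passed to parse; the "expand"
lines use a different syntax and are not handled here.
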
Index: src/test/display/SrtDisplay.java
===================================================================
--- src/test/display/SrtDisplay.java	(revision 571)
+++ src/test/display/SrtDisplay.java	(working copy)
@@ -57,6 +57,11 @@
 
 	private final Map<Integer, Integer> offsetToBlock = new HashMap<>();
 
+	private String srtDescription;
+	private int codepage;
+	private int id1;
+	private int id2;
+
 	protected void print() {
 		readCommonHeader();
 		readFileHeader();
@@ -119,9 +124,9 @@
 
 		d.setTitle("Description");
 
-		String s = d.zstringValue("Description: %s");
+		srtDescription = d.zstringValue("Description: %s");
 
-		long remain = description.getLen() - s.length() - 1;
+		long remain = description.getLen() - srtDescription.length() - 1;
 		d.rawValue((int) remain);
 
 		d.print(outStream);
@@ -138,10 +143,10 @@
 		d.setSectStart(start);
 		reader.position(start);
 		int len = d.charValue("sub header len %d");
-		d.charValue("id1 %d");
-		d.charValue("id2 %d");
+		id1 = d.charValue("id1 %d");
+		id2 = d.charValue("id2 %d");
 
-		int codepage = d.charValue("codepage %d");
+		codepage = d.charValue("codepage %d");
 		if (codepage == 65001)
 			isUnicode = true;
 		Charset charset = Sort.charsetFromCodepage(codepage);
@@ -206,38 +211,79 @@
 		d.setTitle("------- Summary of ordering --------");
 
 		Formatter chars = new Formatter();
-		Formatter comment = new Formatter();
+		//Formatter comment = new Formatter();
+		
+		// reproduce header like mkgmap resource/sort/cp*.txt entries
+		chars.format("\n\n\n");
+		chars.format("# Compare this with resource/sort/cp%d.txt.\n\n", codepage);
+		chars.format("codepage %d\n", codepage);
+		chars.format("id1 %d\n", id1);
+		chars.format("id2 %d\n", id2);
+		chars.format("description \"%s\"\n\n", srtDescription);
+		chars.format("characters\n\n");
+
 		CharPosition last = new CharPosition(0);
-		last.first = -1;
+		//last.first = -1;
+		last.first = 0; // start first line with zero/ignore sortOrder
 		for (CharPosition cp : charmap) {
-			if (cp.expands)
+			if (cp.expands > 0)
 				continue;
+			int unicodeChar = toUnicode(cp.val);
+			if (unicodeChar < 0) // no character defined for this position
+				continue;
 
 			if (cp.first != last.first) {
 				//chars.format("    # %s\n[%d] < ", comment, cp.first);
-				chars.format("\n< ");
-				comment = new Formatter();
+				chars.format("\n < ");
+				//comment = new Formatter();
 			} else if (cp.second != last.second) {
 				chars.format(" ; ");
-				comment.format(" ; ");
+				//comment.format(" ; ");
 			} else if (cp.third != last.third) {
 				chars.format(",");
-				comment.format(",");
+				//comment.format(",");
 			} else {
 				chars.format("=");
-				comment.format("=");
+				//comment.format("=");
 			}
 			last = cp;
-			chars.format("%s", fmtChar(toUnicode(cp.val)));
-			comment.format("U+%04x", cp.val);
+			chars.format("%s", fmtChar(unicodeChar));
+			//comment.format("U+%04x", cp.val);
 		}
 
 		chars.format("\n");
 		for (CharPosition cp : charmap) {
-			if (cp.expands)
-				continue;
-			chars.format("%4s %s\n", fmtChar(toUnicode(cp.val)), cp);
+			if (cp.expands > 0) {
+				chars.format("expand %s to ", fmtChar(toUnicode(cp.val)));
+				for (int i = 0; i <= cp.expands; ++i) {
+					CharPosition ch = expansions.get(cp.first + i - 1);
+					// need to search for best char with this first/primary
+					// the meaning of the values in second/third (eg sec=8,tert=3) is unknown,
+					// %%% but they should probably have more effect on the entry selected
+					int charValue = -1;
+					for (CharPosition scanCp : charmap) {
+						if (scanCp.expands > 0)
+							continue;
+						if (scanCp.first == ch.first) {
+							if (scanCp.second == 1 && scanCp.third == 2) { // try and get upper case %%%
+								charValue = scanCp.val;
+								break;
+							} else if (charValue < 0) {
+								charValue = scanCp.val;
+							}
+						}
+					}
+					if (charValue >= 0)
+						charValue = toUnicode(charValue);
+					if (charValue >= 0)
+						chars.format(" %c", charValue);
+				}
+				chars.format("\n");
+			}
+			// else
+			//chars.format("%4s %s\n", fmtChar(toUnicode(cp.val)), cp);
 		}
+		chars.format("\n# to here\n");
 
 		d.item().addText(chars.toString());
 		d.print(outStream);
@@ -286,7 +332,11 @@
 		StringBuilder sb = new StringBuilder();
 		Formatter fmt = new Formatter(sb);
 		fmt.format("0x%02x ", charValue);
-		fmt.format("(%c) ", toUnicode(charValue));
+		int unicodeChar = toUnicode(charValue);
+		if (unicodeChar < 0) // no character defined for this position
+			fmt.format("NaC ");
+		else
+			fmt.format("(%c) ", unicodeChar);
 		if ((flags & 0x1) != 0)
 			sb.append("Letter ");
 		if ((flags & 0x2) != 0)
@@ -297,8 +347,8 @@
 		} else {
 			// This is an expansion, it sorts as two or more characters (eg ß sorts near ss).
 			// The pos is an index into srt5.
-			c.expands = true;
-			expansion(sb, c.first, (flags >> 4) & 0xf);
+			c.expands = (flags >> 4) & 0xf;
+			expansion(sb, c.first, c.expands);
 		}
 
 		item.addText(sb.toString());
@@ -373,7 +423,7 @@
 			CharBuffer chars = decoder.decode(b);
 			return chars.charAt(0);
 		} catch (CharacterCodingException e) {
-			return '?';
+			return -1;
 		}
 	}
 
@@ -472,7 +522,7 @@
 		private int first;
 		private int second;
 		private int third;
-		private boolean expands;
+		private int expands;
 
 		public CharPosition(int charValue) {
 			this.val = charValue;