Re: [mkgmap-dev] Fix and augment sort definitions
Hi Gerd

The same fix was applied to cp1252.txt some time ago, and this patch fixes the rest of the sort/cp*.txt files in the same way - making them do what the writer intended.

I don't think a unit test for this is worth the time or effort.

Ticker

On Thu, 2023-10-12 at 08:14 +, Gerd Petermann wrote:
> Hi Ticker,
>
> please can you provide a unit test for this?
>
> Gerd
Re: [mkgmap-dev] Fix and augment sort definitions
Hi Ticker,

please can you provide a unit test for this?

Gerd

From: mkgmap-dev on behalf of Ticker Berkin
Sent: Thursday, 12 October 2023 09:18
To: Development list for mkgmap
Subject: Re: [mkgmap-dev] Fix and augment sort definitions

Hi Gerd

Last year it was reported that string ordering with the '#' character was incorrect. This was because, in the sort/cp*.txt files, the relevant line with the '#' was taken as a comment.

I had a patch that fixed all the files, but it also attempted to do more with ß/ss and diphthongs.

I've done another patch that doesn't have any contentious changes: it just fixes the #, makes the layout consistent between the files, increments the version/id2 values and slightly improves the documentation.

Ticker
Re: [mkgmap-dev] Fix and augment sort definitions
Hi Gerd

Last year it was reported that string ordering with the '#' character was incorrect. This was because, in the sort/cp*.txt files, the relevant line with the '#' was taken as a comment.

I had a patch that fixed all the files, but it also attempted to do more with ß/ss and diphthongs.

I've done another patch that doesn't have any contentious changes: it just fixes the #, makes the layout consistent between the files, increments the version/id2 values and slightly improves the documentation.

Ticker

On Tue, 2022-01-11 at 14:00 +, Gerd Petermann wrote:
> Hi Ticker,
>
> if you don't mind I'd like to postpone this patch until the active
> branches are merged into trunk.
>
> Gerd
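The '#' problem described in this message can be illustrated with a small sketch. It is not the mkgmap reader (that lives in SrtTextReader.java); the function names below are hypothetical and the only assumptions are that a '#' starts a comment and that a 4-digit hex token such as 0023 names a character by code.

```python
HEX_DIGITS = set("0123456789abcdefABCDEF")


def strip_comment(line: str) -> str:
    """Naive comment handling: drop everything from the first '#' onward."""
    return line.split("#", 1)[0].rstrip()


def parse_char_token(token: str) -> str:
    """A 4-digit hex token names a code point; anything else is a literal."""
    if len(token) == 4 and all(c in HEX_DIGITS for c in token):
        return chr(int(token, 16))
    return token


def chars_on_line(line: str) -> list[str]:
    """Return the characters a single '<' line contributes to the sort."""
    body = strip_comment(line)
    tokens = body.replace("<", " ").split()
    return [parse_char_token(t) for t in tokens]


# Old form: the '#' starts a comment, so the line contributes nothing.
print(chars_on_line(" < #"))
# New form: the hex code survives comment stripping and yields '#'.
print(chars_on_line(" < 0023"))
```

With the old line the character never reaches the collation table, which matches the reported symptom of '#' being ignored when ordering strings.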
Re: [mkgmap-dev] Fix and augment sort definitions
Hi Ticker,

if you don't mind I'd like to postpone this patch until the active branches are merged into trunk.

Gerd

From: mkgmap-dev on behalf of Ticker Berkin
Sent: Tuesday, 11 January 2022 11:25
To: Development list for mkgmap
Subject: Re: [mkgmap-dev] Fix and augment sort definitions

Hi Gerd

Yes - gmapsupp builder gives a warning if id1/id2 are not consistent in all the .img files. It is just a warning and gmapsupp is built anyway and I think the warning can be ignored. gmapi doesn't notice.

Almost all of the significant sorting where the Garmin device... needs to know the sort details happens in Mdr, so this isn't a problem.

Other uses are mostly for de-duping/efficient processing, so these shouldn't matter either.

However the LBL file does hold id1/id2 and many sections (Countries, Regions, Cities, Zips, POIs) are sorted so the effect here is unknown.

If using --latin2 / 1252, the only change in ordering is around AE/OE diphthongs.

Within the same commit or build as sortResource_v2, the attached sortMashExp.patch should be applied, as it affects the binary SRT file and I don't want to increment all the id2's again. This patch changes the sort.expand TERTIARY mashing from 2 to 3, which is slightly more consistent with the Garmin SRT binaries I've seen and allows SrtDisplay to show expansions with what looks like a meaningful case.

Ticker

___
mkgmap-dev mailing list
mkgmap-dev@lists.mkgmap.org.uk
https://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev
Re: [mkgmap-dev] Fix and augment sort definitions
Hi Gerd

Yes - gmapsupp builder gives a warning if id1/id2 are not consistent in all the .img files. It is just a warning and gmapsupp is built anyway and I think the warning can be ignored. gmapi doesn't notice.

Almost all of the significant sorting where the Garmin device... needs to know the sort details happens in Mdr, so this isn't a problem.

Other uses are mostly for de-duping/efficient processing, so these shouldn't matter either.

However the LBL file does hold id1/id2 and many sections (Countries, Regions, Cities, Zips, POIs) are sorted so the effect here is unknown.

If using --latin2 / 1252, the only change in ordering is around AE/OE diphthongs.

Within the same commit or build as sortResource_v2, the attached sortMashExp.patch should be applied, as it affects the binary SRT file and I don't want to increment all the id2's again. This patch changes the sort.expand TERTIARY mashing from 2 to 3, which is slightly more consistent with the Garmin SRT binaries I've seen and allows SrtDisplay to show expansions with what looks like a meaningful case.

Ticker

On Tue, 2022-01-11 at 06:31 +, Gerd Petermann wrote:
> Hi Ticker,
>
> didn't try it: Will mkgmap complain when building an indexed
> gmapi/gmapsupp where some tiles were freshly compiled with the new
> version and others with an older (like Felix and Carlos do)?
>
> Gerd

Index: src/uk/me/parabola/mkgmap/srt/SrtTextReader.java =
Re: [mkgmap-dev] Fix and augment sort definitions
Hi Ticker,

didn't try it: Will mkgmap complain when building an indexed gmapi/gmapsupp where some tiles were freshly compiled with the new version and others with an older (like Felix and Carlos do)?

Gerd

From: mkgmap-dev on behalf of Ticker Berkin
Sent: Monday, 10 January 2022 12:04
To: Development list for mkgmap
Subject: Re: [mkgmap-dev] Fix and augment sort definitions

Hi Gerd

What I meant was that keyboards/devices don't normally have ways of entering the single chars "…", "¼", "½", "¾", "™".

Names with these might be presented by Garmin software after some initial chars have been entered and you can then select the complete name that contains these chars.

I didn't see a good reason to remove the expand for these and find some arbitrary sort PRIMARY for them. No one has complained about them. Also cp65001 had over 1000 expands and I really don't want to start touching these.

Ticker
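The mixed-tile situation Gerd asks about, and the "warn but build anyway" behaviour Ticker describes for the gmapsupp builder, can be sketched as follows. This is only an illustration under stated assumptions: `SortId`, `check_sort_ids` and the tile file names are hypothetical and are not mkgmap classes or real files; the id values are the cp1252 id1 of 7 and the old and new id2 of 2 and 3 from the patch.

```python
import warnings
from typing import NamedTuple


class SortId(NamedTuple):
    """Sort identity stored with a tile (hypothetical container)."""
    codepage: int
    id1: int
    id2: int


def check_sort_ids(tiles: dict[str, SortId]) -> bool:
    """Warn, but do not fail, when tiles carry different sort identifiers."""
    distinct = set(tiles.values())
    if len(distinct) > 1:
        warnings.warn(f"inconsistent sort ids across tiles: {sorted(distinct)}")
        return False
    return True


# One tile built with the old cp1252 sort (id2 2), one with the new (id2 3).
tiles = {
    "old-tile.img": SortId(1252, 7, 2),
    "new-tile.img": SortId(1252, 7, 3),
}
consistent = check_sort_ids(tiles)
print("consistent" if consistent else "built anyway, with a warning")
```

The point of returning a flag instead of raising is that the combined map is still produced, which is the behaviour reported in the thread.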
Re: [mkgmap-dev] Fix and augment sort definitions
Hi Gerd

What I meant was that keyboards/devices don't normally have ways of entering the single chars "…", "¼", "½", "¾", "™".

Names with these might be presented by Garmin software after some initial chars have been entered and you can then select the complete name that contains these chars.

I didn't see a good reason to remove the expand for these and find some arbitrary sort PRIMARY for them. No one has complained about them. Also cp65001 had over 1000 expands and I really don't want to start touching these.

Ticker

On Mon, 2022-01-10 at 10:29 +, Gerd Petermann wrote:
> Hi Ticker,
>
> I've committed displaySrt_v2.patch .
>
> I don't fully understand the comment
> "Leave the above because no method of inputting them anyway and
> unlikely at start of names."
>
> It is possible to enter these characters in MapSource and I think
> MapSource uses MDR12 when you type only a few characters for the name
> of a POI and don't pick up an entry from the list.
>
> Gerd
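As a rough illustration of what an expand entry does for ordering, the sketch below replaces each expanded character by its sequence before comparing. It is a simplification that looks only at upper-cased primary letters, the function name is hypothetical, the two table entries are the ones visible in the attached patch ("…" to three dots, "™" to T M), and the place names are made up for the example.

```python
# Expansions taken from the patch context lines; all other chars pass through.
EXPAND = {"…": "...", "™": "TM"}


def expanded(name: str) -> str:
    """Replace every expand character by the sequence it expands to."""
    return "".join(EXPAND.get(ch, ch) for ch in name)


names = ["AcmeTZ", "Acme™", "AcmeTA"]
# Sort on the expanded, upper-cased form: "™" lands where "TM" would.
print(sorted(names, key=lambda n: expanded(n).upper()))
```

Because the character sorts as its expansion, it needs no PRIMARY position of its own, which is the reason given above for leaving these entries alone.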
Re: [mkgmap-dev] Fix and augment sort definitions
Hi Ticker,

I've committed displaySrt_v2.patch .

I don't fully understand the comment
"Leave the above because no method of inputting them anyway and unlikely at start of names."

It is possible to enter these characters in MapSource and I think MapSource uses MDR12 when you type only a few characters for the name of a POI and don't pick up an entry from the list.

Gerd

From: mkgmap-dev on behalf of Ticker Berkin
Sent: Monday, 10 January 2022 11:20
To: Development list for mkgmap
Subject: Re: [mkgmap-dev] Fix and augment sort definitions

Hi Gerd

I tried various approaches to fixing "Find" when the fixed length Mdr17 (maybe also Mdr12) prefix contains sort.expand chars and couldn't make it work. I could document these attempts in Sort.java if you feel this is worthwhile.

New patch attached that, for cp1252, leaves "ß" as its own PRIMARY after "s". Moved æ,Æ etc to be PRIMARIES on the grounds that their behaviour will be the same as "ß". Made cp1254 consistent as it had similar partial fixes.

The main reason for the patch is to fix all the other sort/cp*.txt files that had line " < #" which was taken as a comment, resulting in "#" being ignored in collation.

With the Display patch (sent previously, but also attached here), it can reproduce the resource/sort file from the binary SRT section.

Ticker
Re: [mkgmap-dev] Fix and augment sort definitions
Hi Gerd

I tried various approaches to fixing "Find" when the fixed length Mdr17 (maybe also Mdr12) prefix contains sort.expand chars and couldn't make it work. I could document these attempts in Sort.java if you feel this is worthwhile.

New patch attached that, for cp1252, leaves "ß" as its own PRIMARY after "s". Moved æ,Æ etc to be PRIMARIES on the grounds that their behaviour will be the same as "ß". Made cp1254 consistent as it had similar partial fixes.

The main reason for the patch is to fix all the other sort/cp*.txt files that had line " < #" which was taken as a comment, resulting in "#" being ignored in collation.

With the Display patch (sent previously, but also attached here), it can reproduce the resource/sort file from the binary SRT section.

Ticker

Index: resources/sort/README
===
--- resources/sort/README (revision 4856)
+++ resources/sort/README (working copy)
@@ -35,22 +35,24 @@
 I believe that these are arbitary identifiers. Here is a registry of values
 we are using. If you make a variation on a code-page sort-order then
 give it a different id2 value.
+It is believed that having sorts with the same id1/id2 but different data loaded
+on the same device will give unexpected results
 
-code-page id1 id2
+code-page id1 description
 
-1250      12  1
-1251      8   1
-1252      7   2
-1253      13  1
-1254      14  1
-1255      15  1
-1256      16  1
-1257      17  1
-1258      18  1
-874       11  1
-932       9   1
-936       5   1
-949       10  1
+1250      12   Central European sort
+1251      8    Cyrillic sort
+1252      7    Western European sort
+1253      13   Greek sort
+1254      14   Turkish sort
+1255      15   Hebrew sort
+1256      16?9 Arabic sort cp1256.txt has id1=9, original version of this doc said 16
+1257      17   Latin Baltic sort
+1258      18   Vietnamese sort
+874       11   Thai. 8-bit not implemented
+932       9    Japanese. Shift JIS not implemented. Note id1=9 used by 1256
+936       5    Simplified Chinese not implemented
+949       10   Korean. Unified Hangui not implemented
 
-65001     19  4
-0         0   0
+65001     19   Unicode sort
+0         0    ASCII 7-bit sort
Index: resources/sort/cp0.txt
===
--- resources/sort/cp0.txt (revision 4856)
+++ resources/sort/cp0.txt (working copy)
@@ -1,9 +1,11 @@
 codepage 0
 id1 0
-id2 1
+# 10-Jan-2022 Increment id2/version. Fix '#' to 0023
+id2 2
 description "ASCII 7-bit sort"
 
 characters
+
  =0008=000e=000f=0010=0011=0012=0013=0014=0015=0016=0017=0018=0019=001a=001b=001c=001d=001e=001f=007f,0001,0002,0003,0004,0005,0006,0007
  < 0009
  < 000a
@@ -32,7 +34,7 @@
  < /
  < \
  < &
- < #
+ < 0023
  < %
  < `
  < ^
@@ -79,3 +81,5 @@
  < x,X
  < y,Y
  < z,Z
+
+# ends
Index: resources/sort/cp1250.txt
===
--- resources/sort/cp1250.txt (revision 4856)
+++ resources/sort/cp1250.txt (working copy)
@@ -1,9 +1,11 @@
 codepage 1250
 id1 12
-id2 1
+# 10-Jan-2022 Increment id2/version. Fix '#' to 0023
+id2 2
 description "Central European sort"
 
 characters
+
  =0008=000e=000f=0010=0011=0012=0013=0014=0015=0016=0017=0018=0019=001a=001b=001c=001d=001e=001f=007f=00ad,0001,0002,0003,0004,0005,0006,0007
  < 0009
  < 000a
@@ -45,7 +47,7 @@
  < /
  < \
  < &
- < #
+ < 0023
  < %
  < ‰
  < †
@@ -120,3 +122,5 @@
 expand ˛ to § 0020
 expand ß to s s
 expand ™ to T M
+
+# ends
Index: resources/sort/cp1251.txt
===
--- resources/sort/cp1251.txt (revision 4856)
+++ resources/sort/cp1251.txt (working copy)
@@ -1,9 +1,11 @@
 codepage 1251
 id1 8
-id2 1
+# 10-Jan-2022 Increment id2/version. Fix '#' to 0023
+id2 2
 description "Cyrillic sort"
 
 characters
+
  =0008=000e=000f=0010=0011=0012=0013=0014=0015=0016=0017=0018=0019=001a=001b=001c=001d=001e=001f=007f=00ad,0001,0002,0003,0004,0005,0006,0007
  < 0009
  < 000a
@@ -45,7 +47,7 @@
  < /
  < \
  < &
- < #
+ < 0023
  < %
  < ‰
  < †
@@ -152,7 +154,8 @@
  < э,Э
  < ю,Ю
  < я,Я
-
 expand … to . . .
 expand № to N o
 expand ™ to T M
+
+# ends
Index: resources/sort/cp1252.txt
===
--- resources/sort/cp1252.txt (revision 4856)
+++ resources/sort/cp1252.txt (working copy)
@@ -1,9 +1,7 @@
-
-
-# This must be first before any 'code' lines.
 codepage 1252
 id1 7
-id2 2
+# 10-Jan-2022 Increment id2/version. Add comment about expansions. Move AE/ae/OE/oe
+id2 3
 description "Western European sort"
 
 characters
@@ -96,7 +94,8 @@
  < 7
  < 8
  < 9
- < a,A,ª ; á,Á ; à,À ; â,Â ; å,Å ; ä,Ä ; ã,Ã ; æ,Æ
+ < a,A,ª ; á,Á ; à,À ; â,Â ; å,Å ; ä,Ä ; ã,Ã
+ < æ,Æ
  < b,B
  < c,C ; ç,Ç
  < d,D ; ð,Ð
@@ -111,7 +110,8 @@
  < l,L
  < m,M
  < n,N ; ñ,Ñ
- < o,O,º ; ó,Ó ; ò,Ò ; ô,Ô ; ö,Ö ; õ,Õ ; ø,Ø ; œ,Œ
+ < o,O,º ; ó,Ó ; ò,Ò ; ô,Ô ; ö,Ö ; õ,Õ
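The effect of moving æ,Æ onto its own line in the cp1252 hunk can be shown with a small weight-assignment sketch. It assumes the usual reading of the characters section, where `<` starts a new PRIMARY, `;` a new secondary and `,` a new TERTIARY difference; the function is hypothetical, ignores hex codes, the `=` form and literal `,` or `;` characters, and uses a shortened version of the lines in the patch.

```python
def assign_weights(lines: list[str]) -> dict[str, tuple[int, int, int]]:
    """Map each character to (primary, secondary, tertiary) weights."""
    weights: dict[str, tuple[int, int, int]] = {}
    primary = 0
    for line in lines:
        line = line.strip()
        if not line.startswith("<"):
            continue  # only '<' lines are handled in this sketch
        primary += 1
        for secondary, group in enumerate(line[1:].split(";"), start=1):
            for tertiary, ch in enumerate(group.split(","), start=1):
                ch = ch.strip()
                if ch:
                    weights[ch] = (primary, secondary, tertiary)
    return weights


old = assign_weights(["< a,A ; ä,Ä ; æ,Æ", "< b,B"])
new = assign_weights(["< a,A ; ä,Ä", "< æ,Æ", "< b,B"])

# Before: æ is only a secondary variant of a. After: it has its own primary
# that sits between a and b.
print(old["a"], old["æ"], old["b"])
print(new["a"], new["æ"], new["b"])
```

Under these assumptions a name starting with æ no longer compares equal to one starting with a at PRIMARY strength, which is the behaviour the message describes for "ß" after "s".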
Re: [mkgmap-dev] Fix and augment sort definitions
Hi Gerd

On my eTrex 30x:

Same --lower-case map as before. Also loaded is a GB map that is disabled in Setup>Map; it uses the same charset & sort, and its sort has the same id1 but a different id2.

Where To? > Cities > Spell Search > "VOS" gives "No Results Found".

Spell Search has an option to change the keyboard language. GERMAN allows input of A/O/U umlaut and ß. "Voß" gives "No Results Found".

So it seems that the old HCx works well and the new 30x doesn't.

Tried again after removing the GB map and it still fails.

Doing an Address street search (i.e. skipping the city), ß finds ß and ss, but ss doesn't find ß.

I'll investigate and experiment more, mainly with MDR17/PrefixIndex. There is a possible problem with sort.collator compareOneStrengthWithLength() in that it counts expansions whereas the short string doesn't. So, for prefix length 4, "Voßaaa" and "Voßbbb" are considered equal, but the output is "Voßa" only. Maybe the 2/4 char string should be the major PRIMARY representation - upper case, unaccented, expanded.

Ticker
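The suspected mismatch between comparing on expanded characters and emitting unexpanded prefixes can be reproduced in a few lines. This is only a sketch of the reasoning in this message, not the mkgmap code: the function names are hypothetical and the expansion table holds just the ß to ss case.

```python
EXPAND = {"ß": "ss"}


def expanded_key(name: str, length: int) -> str:
    """Comparison key: expand first, then take `length` characters."""
    full = "".join(EXPAND.get(ch, ch) for ch in name)
    return full[:length].upper()


def prefix_index(names: list[str], length: int) -> list[str]:
    """Keep one entry per comparison key, but emit the raw prefix."""
    seen: set[str] = set()
    out: list[str] = []
    for name in sorted(names, key=lambda n: expanded_key(n, length)):
        key = expanded_key(name, length)
        if key not in seen:
            seen.add(key)
            out.append(name[:length])  # raw prefix: the expansion is not counted
    return out


names = ["Voßaaa", "Voßbbb"]
# Both names share the key "VOSS", so they are treated as equal ...
print(expanded_key(names[0], 4), expanded_key(names[1], 4))
# ... yet the single emitted prefix is the unexpanded "Voßa".
print(prefix_index(names, 4))
```

Emitting the expanded, upper-cased key instead of the raw prefix would correspond to the "major PRIMARY representation" idea floated at the end of the message.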
Re: [mkgmap-dev] Fix and augment sort definitions
Hi Gerd

I downloaded niedersachsen-latest.osm.pbf. Built with current trunk, my resources/sort changes, option --latin (but not --lower-case). Loaded gmapsupp onto eTrex HCx.

Find>Cities by name "VOSS" and "VOSSBERG" gives me 3 VOSSBERGs, all in LOWER SAXONY, including the one you mention. The eTrex name entry has a shift that allows entry of accented/eszett/ae etc. "VOß" etc finds the same.

Rebuilding with --lower-case, Find>Cities "VOS", "VOSS", "VOSSB" ... all work, showing the 3 "Voßberg"s. "VOß", "VOßB" etc also work. The only strangeness is that the name entry also has a lower case shift, for both standard latin and the extra chars as mentioned. Shifting to lower-case, all letters are disabled except "ß".

Find Address doesn't allow entry for this city because none of them have any streets.

I've just noticed that my version of trunk has mdrUnicode_v9b.patch applied, but the only significant difference is in MDR25 and will only affect street cities rather than POI ones.

Will do the same testing on other eTrex

Ticker
Re: [mkgmap-dev] Fix and augment sort definitions
Hi Ticker,

OK, maybe you find more on that.
BTW: Voßberg is very special as it probably also influences the MDR 17 content.

I think I'll merge the faster-mp branch into trunk this afternoon and continue on the Huffman encoding later.

Gerd
Re: [mkgmap-dev] Fix and augment sort definitions
Hi Gerd

I'm just building your area to test on my UK configured devices.

I'm not sure yet of the benefit of testing with the mdr2 branch. Actually, until we've better understood the indexing issues thrown up by use of --lower-case, the extra complications of ß seem too much.

Ticker
Re: [mkgmap-dev] Fix and augment sort definitions
Hi Ticker,

I think you need option -g with svn log to see changes done in branches.

Anyhow, I think I made a mistake back then because I didn't think of devices which are not configured to show a German keyboard (which has keys for äöü and ß).

You are probably right that the expands are better for this case and I don't remember what problems I had with the ß. It is very likely that the open collator strength questions are the better approach.

I test with a tile around my hometown Wildeshausen
55410043: 2447360,389120 to 2469888,407552
# : 52.514648,8.349609 to 52.998047,8.745117

One problem that I found on the Oregon is the search for a name that appears as a nearby city called "Voßberg"
https://www.openstreetmap.org/node/599127249

This name doesn't appear in the Oregon's basemap. I created a map with --latin and --lower-case. When I search for Voß it is not found. Same when I search for Voßber. Only the search for the full name works. Voss is also not found, but Vossberg works.

Without the patch, search for Voß returns Voßberg, search for Vossberg does that also, which is quite confusing for me.

My basemap shows
- Description --
0035 | 41 53 43 49 49 20 53 6f | Description: ASCII Sort
003d | 72 74 00 |
- Character table header ---
0040 | 00 | 5c 00 | sub header len 92
0042 | 02 | 01 00 | id1 1
0044 | 04 | 01 00 | id2 1
0046 | 06 | e4 04 | codepage 1252

Maybe I should repeat those tests with the mdr2 branch?

Gerd
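For readers checking the header dump in Gerd's message by hand, the bytes decode as little-endian 16-bit values. This is only a sketch of that arithmetic using the bytes quoted in the dump; the variable names are illustrative and nothing here reflects how mkgmap itself reads SRT files.

```python
import struct

# Bytes copied from the dump: description, then the four 16-bit header fields.
description_bytes = bytes.fromhex("41 53 43 49 49 20 53 6f 72 74 00")
header_bytes = bytes.fromhex("5c 00 01 00 01 00 e4 04")

# The description is a NUL-terminated ASCII string.
description = description_bytes.rstrip(b"\x00").decode("ascii")

# "<4H" = four unsigned 16-bit integers, least significant byte first.
sub_header_len, id1, id2, codepage = struct.unpack("<4H", header_bytes)

print(description)
print(sub_header_len, id1, id2, codepage)
```

So "e4 04" is 0x04e4, i.e. codepage 1252, and "5c 00" is the sub header length 92, matching the annotations in the dump.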
Re: [mkgmap-dev] Fix and augment sort definitions
Hi Gerd

Sorry - I hadn't noticed these changes. They don't show up with
$ svn log resources/sort/cp1252.txt or cp1254.txt

All the other mkgmap sort files have all the expansions possible including the eszett and diphthongs if applicable.

The two non-mkgmap sort files (848.SRT/Turkey and I3A0.SRT/adriatic TOPO) have expand for "ß" and some of "Œ"... so I presumed it was expected and reasonably supported.

In the binaries, the expand is expressed as a list of sortOrders {primary,secondary,tertiary}. The secondary and tertiary are disrupted and don't match ones from actual characters (in the case of "ß", the two s's get different secondaries). So these double chars will sort after the real char and only match with PRIMARY.

As there are many unknowns about how to make --lower-case indexing work and the setting, regarding collation strength, of the bit-flag indicating same-name in some of the MDR sections, I feel that it is better to have all the expands.

However, if you are against this, I'll redo cp1252 without these expansions. I'm not sure of the basis of having the diphthongs as alternate secondaries of their first character and the eszett as a unique character.

Ticker
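The point about expansions matching only at PRIMARY can be shown with a toy table. The numeric sort orders below are invented for illustration (they are not taken from any SRT file), and SortOrder, TABLE and collation_key are hypothetical names rather than mkgmap classes.

```python
from typing import NamedTuple


class SortOrder(NamedTuple):
    primary: int
    secondary: int
    tertiary: int


# Invented values: "ß" expands to two entries that share the primary of "s"
# but carry secondaries that no real character uses.
TABLE = {
    "s": [SortOrder(40, 1, 1)],
    "ß": [SortOrder(40, 2, 1), SortOrder(40, 3, 1)],
}


def collation_key(text, strength):
    """Key built level by level: all primaries, then secondaries, then tertiaries."""
    orders = [order for ch in text for order in TABLE[ch]]
    return tuple(tuple(order[level] for order in orders) for level in range(strength))


# strength 1 = PRIMARY only, 3 = PRIMARY + SECONDARY + TERTIARY
print(collation_key("ss", 1) == collation_key("ß", 1))  # equal at PRIMARY
print(collation_key("ss", 3) < collation_key("ß", 3))   # "ss" sorts before "ß"
```

In this model "ss" and "ß" compare equal at PRIMARY but differ at full strength, with the expanded character sorting after the real pair, as the message describes.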
[mkgmap-dev] Fix and augment sort definitions
Hi Ticker,

see
https://www.mkgmap.org.uk/websvn/revision.php?repname=mkgmap=3948
https://www.mkgmap.org.uk/websvn/revision.php?repname=mkgmap=3949

the sortResource.patch reverts these changes.

In Mapsource the results are a bit better with your patch. I'll try again with my Oregon later.

Gerd
[mkgmap-dev] Fix and augment sort definitions
Hi Gerd

In the resources/sort/cp*.txt sort definitions, I noticed that all but cp1252.txt/Western European and cp65001.txt/unicode don't handle "#" correctly, and that cp1252.txt/Western European and cp1254.txt/Turkish don't add expansions for all their double-characters.

Patch attached that fixes these problems, makes the layout consistent, adds useful information to README and removes the id2 current value, which just gets out-of-date. All the id2 values are incremented if appropriate.

The "#" fix might be the reason why searching for street names starting with # didn't work for code-page=0 / format6.

Adding the "ß" expansion might be a step towards resolving the problem with searching and --lower-case that was the subject of discussion a year or so ago - I can't find the thread.

Ticker

Index: resources/sort/README
===
--- resources/sort/README (revision 4842)
+++ resources/sort/README (working copy)
@@ -35,22 +35,24 @@
 I believe that these are arbitary identifiers. Here is a registry of values we are using.
 If you make a variation on a code-page sort-order then give it a different id2 value.
+It is believed that having sorts with the same id1/id2 but different data loaded
+on the same device will give unexpected results
-code-page id1 id2
+code-page id1 description
-1250 12 1
-1251 8 1
-1252 7 2
-1253 13 1
-1254 14 1
-1255 15 1
-1256 16 1
-1257 17 1
-1258 18 1
-874 11 1
-932 9 1
-936 5 1
-949 10 1
+1250 12 Central European sort
+1251 8 Cyrillic sort
+1252 7 Western European sort
+1253 13 Greek sort
+1254 14 Turkish sort
+1255 15 Hebrew sort
+1256 16?9 Arabic sort cp1256.txt has id1=9, original version of this doc said 16
+1257 17 Latin Baltic sort
+1258 18 Vietnamese sort
+874 11 Thai. 8-bit not implemented
+932 9 Japanese. Shift JIS not implemented. Note id1=9 used by 1256
+936 5 Simplified Chinese not implemented
+949 10 Korean. Unified Hangui not implemented
-65001 19 4
-0 0 0
+65001 19 Unicode sort
+0 0 ASCII 7-bit sort
Index: resources/sort/cp0.txt
===
--- resources/sort/cp0.txt (revision 4842)
+++ resources/sort/cp0.txt (working copy)
@@ -1,9 +1,11 @@
 codepage 0
 id1 0
-id2 1
+# 02-Jan-2022 changed version from 1 to 2. fix '#' to 0023
+id2 2
 description "ASCII 7-bit sort"
 characters
+
 =0008=000e=000f=0010=0011=0012=0013=0014=0015=0016=0017=0018=0019=001a=001b=001c=001d=001e=001f=007f,0001,0002,0003,0004,0005,0006,0007
  < 0009
  < 000a
@@ -32,7 +34,7 @@
  < /
  < \
  < &
- < #
+ < 0023
  < %
  < `
  < ^
@@ -79,3 +81,5 @@
  < x,X
  < y,Y
  < z,Z
+
+# ends
Index: resources/sort/cp1250.txt
===
--- resources/sort/cp1250.txt (revision 4842)
+++ resources/sort/cp1250.txt (working copy)
@@ -1,9 +1,11 @@
 codepage 1250
 id1 12
-id2 1
+# 02-Jan-2022 changed version from 1 to 2. fix '#' to 0023
+id2 2
 description "Central European sort"
 characters
+
 =0008=000e=000f=0010=0011=0012=0013=0014=0015=0016=0017=0018=0019=001a=001b=001c=001d=001e=001f=007f=00ad,0001,0002,0003,0004,0005,0006,0007
  < 0009
  < 000a
@@ -45,7 +47,7 @@
  < /
  < \
  < &
- < #
+ < 0023
  < %
  < ‰
  < †
@@ -120,3 +122,5 @@
 expand ˛ to § 0020
 expand ß to s s
 expand ™ to T M
+
+# ends
Index: resources/sort/cp1251.txt
===
--- resources/sort/cp1251.txt (revision 4842)
+++ resources/sort/cp1251.txt (working copy)
@@ -1,9 +1,11 @@
 codepage 1251
 id1 8
-id2 1
+# 02-Jan-2022 changed version from 1 to 2. fix '#' to 0023
+id2 2
 description "Cyrillic sort"
 characters
+
 =0008=000e=000f=0010=0011=0012=0013=0014=0015=0016=0017=0018=0019=001a=001b=001c=001d=001e=001f=007f=00ad,0001,0002,0003,0004,0005,0006,0007
  < 0009
  < 000a
@@ -45,7 +47,7 @@
  < /
  < \
  < &
- < #
+ < 0023
  < %
  < ‰
  < †
@@ -152,7 +154,8 @@
  < э,Э
  < ю,Ю
  < я,Я
-
 expand … to . . .
 expand № to N o
 expand ™ to T M
+
+# ends
Index: resources/sort/cp1252.txt
===
--- resources/sort/cp1252.txt (revision 4842)
+++ resources/sort/cp1252.txt (working copy)
@@ -1,9 +1,7 @@
-
-
-# This must be first before any 'code' lines.
 codepage 1252
 id1 7
-id2 2
+# 15-Dec-2021 changed version from 2 to 3. Add expansions for AE/ae/OE/oe/ss
+id2 3
 description "Western European sort"
 characters
@@ -96,7 +94,7 @@
  < 7
  < 8
  < 9
- < a,A,ª ; á,Á ; à,À ; â,Â ; å,Å ; ä,Ä ; ã,Ã ; æ,Æ
+ < a,A,ª ; á,Á ; à,À ; â,Â ; å,Å ; ä,Ä ; ã,Ã
  < b,B
  < c,C ; ç,Ç
  < d,D ; ð,Ð
@@ -111,12 +109,11 @@
  < l,L
  < m,M
  < n,N ; ñ,Ñ
- < o,O,º ; ó,Ó ;