Re: [mkgmap-dev] Fix and augment sort definitions

2023-10-12 Thread Ticker Berkin
Hi Gerd

The same fix was applied to cp1252.txt some time ago and this patch fixes the
rest of the sort/cp*.txt in the same way - making them do what the writer
intended.

I don't think a unit test for this is worth the time or effort.

Ticker 


Re: [mkgmap-dev] Fix and augment sort definitions

2023-10-12 Thread Gerd Petermann
Hi Ticker,

please can you provide a unit test for this?

Gerd



Re: [mkgmap-dev] Fix and augment sort definitions

2023-10-12 Thread Ticker Berkin
Hi Gerd

Last year it was reported that string ordering with the '#' character was
incorrect. This was because, in the sort/cp*.txt files, the relevant line
with the '#' was taken as a comment.

I had a patch that fixed all the files, but it also attempted to do more
with ß/ss and diphthongs.

I've done another patch that doesn't have any contentious changes: it just
fixes the '#', makes the layout consistent between the files, increments
the version/id2 values and makes slight improvements to the documentation.
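
To illustrate the '#' problem described above, here is a minimal Python sketch.
It is not mkgmap's reader (the real one is Java); the names strip_comment and
parse_entry are hypothetical. It assumes a reader that drops everything from
the first '#' on a line and accepts a four-digit hex code point in place of a
literal character, as the 0023 replacement in the patch suggests.

```python
HEX_DIGITS = "0123456789abcdefABCDEF"


def strip_comment(line: str) -> str:
    """Drop everything from the first '#' onward, as a naive reader might."""
    return line.split("#", 1)[0].rstrip()


def parse_entry(line: str) -> list:
    """Return the characters named on a '<' line; hex code points are decoded."""
    body = strip_comment(line).strip()
    if not body.startswith("<"):
        return []
    chars = []
    for token in body[1:].split(","):
        token = token.strip()
        if not token:
            continue
        # Four hex digits name a code point, e.g. 0023 for '#'.
        if len(token) == 4 and all(c in HEX_DIGITS for c in token):
            chars.append(chr(int(token, 16)))
        else:
            chars.append(token)
    return chars


print(parse_entry(" < #"))     # [] : the '#' vanished with the "comment"
print(parse_entry(" < 0023"))  # ['#'] : the code point form survives
```

With the literal form the line defines no character at all, so '#' never gets
a sort position; the code point form avoids the comment character entirely.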

Ticker


Re: [mkgmap-dev] Fix and augment sort definitions

2022-01-11 Thread Gerd Petermann
Hi Ticker,

if you don't mind I'd like to postpone this patch until the active branches are 
merged into trunk.

Gerd




___
mkgmap-dev mailing list
mkgmap-dev@lists.mkgmap.org.uk
https://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev


Re: [mkgmap-dev] Fix and augment sort definitions

2022-01-11 Thread Ticker Berkin
Hi Gerd

Yes - gmapsupp builder gives a warning if id1/id2 are not consistent in
all the .img files. It is just a warning and gmapsupp is built anyway
and I think the warning can be ignored. gmapi doesn't notice.
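
As a rough illustration of the kind of consistency check described above, here
is a small Python sketch. It is not the gmapsupp builder's code; the function
name check_sort_ids and the tile file names are hypothetical, and it assumes
each tile simply reports an (id1, id2) pair.

```python
from collections import defaultdict


def check_sort_ids(tiles):
    """tiles: iterable of (filename, id1, id2). Return a warning or None."""
    by_ids = defaultdict(list)
    for name, id1, id2 in tiles:
        by_ids[(id1, id2)].append(name)
    if len(by_ids) <= 1:
        return None  # all tiles agree, nothing to report
    parts = [
        f"id1={i1} id2={i2}: {', '.join(names)}"
        for (i1, i2), names in sorted(by_ids.items())
    ]
    return "inconsistent sort ids; " + "; ".join(parts)


# One tile built with the old cp1252 sort (id2=2), one with the new (id2=3).
print(check_sort_ids([("a.img", 7, 2), ("b.img", 7, 3)]))
```

The sketch only warns and never refuses to build, matching the behaviour
reported above.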

Almost all of the significant sorting where the Garmin device... needs
to know the sort details happens in Mdr, so this isn't a problem.

Other uses are mostly for de-duping/efficient processing, so these
shouldn't matter either.

However the LBL file does hold id1/id2 and many sections (Countries,
Regions, Cities, Zips, POIs) are sorted so the effect here is unknown.

If using --latin2 / 1252, the only change in ordering is around AE/OE
diphthongs.

Within the same commit or build as sortResource_v2, the attached
sortMashExp.patch should be applied, as it affects the binary SRT file
and I don't want to increment all the id2's again. This patch changes
the sort.expand TERTIARY mashing from 2 to 3, which is slightly more
consistent with the Garmin SRT binaries I've seen and allows SrtDisplay
to show expansions with what looks like a meaningful case.

Ticker


Index: src/uk/me/parabola/mkgmap/srt/SrtTextReader.java
=

Re: [mkgmap-dev] Fix and augment sort definitions

2022-01-10 Thread Gerd Petermann
Hi Ticker,

didn't try it: Will mkgmap complain when building an indexed gmapi/gmapsupp
where some tiles were freshly compiled with the new version and others with
an older one (like Felix and Carlos do)?

Gerd




Re: [mkgmap-dev] Fix and augment sort definitions

2022-01-10 Thread Ticker Berkin
Hi Gerd

What I meant was that keyboards/devices don't normally have ways of
entering the single chars "…", "¼", "½", "¾", "™".

Names with these might be presented by Garmin software after some
initial chars have been entered and you can then select the complete
name that contains these chars.

I didn't see a good reason to remove the expand for these and find some
arbitrary sort PRIMARY for them. No one has complained about them. Also
cp65001 had over 1000 expands and I really don't want to start touching
these.

Ticker



Re: [mkgmap-dev] Fix and augment sort definitions

2022-01-10 Thread Gerd Petermann
Hi Ticker,

I've committed displaySrt_v2.patch .

I don't fully understand the comment
"Leave the above because no method of inputting them anyway and unlikely at
start of names."

It is possible to enter these characters in MapSource and I think MapSource
uses MDR12 when you type only a few characters for the name of a POI and
don't pick up an entry from the list.

Gerd




Re: [mkgmap-dev] Fix and augment sort definitions

2022-01-10 Thread Ticker Berkin
Hi Gerd

I tried various approaches to fixing "Find" when the fixed length Mdr17
(maybe also Mdr12) prefix contains sort.expand chars and couldn't make
it work. I could document these attempts in Sort.java if you feel this
is worthwhile.

New patch attached that, for cp1252, leaves "ß" as its own PRIMARY
after "s". Moved æ,Æ etc to be PRIMARIES on the grounds that their
behaviour will be the same as "ß". Made cp1254 consistent as it had
similar partial fixes.
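
The effect of moving æ,Æ to their own PRIMARY can be sketched in Python. This
is only an illustration, not mkgmap code: build_keys is a hypothetical helper,
the sort lines are shortened, and it assumes that in the sort files '<' starts
a new PRIMARY, ';' a new SECONDARY and ',' a new TERTIARY difference.

```python
def build_keys(lines):
    """Map each character to a (primary, secondary, tertiary) tuple."""
    keys = {}
    primary = 0
    for line in lines:
        line = line.strip()
        if not line.startswith("<"):
            continue
        primary += 1  # every '<' line opens a new PRIMARY position
        for secondary, group in enumerate(line[1:].split(";"), start=1):
            for tertiary, ch in enumerate(group.split(","), start=1):
                ch = ch.strip()
                if ch:
                    keys[ch] = (primary, secondary, tertiary)
    return keys


# Shortened versions of the old and new cp1252 lines.
before = build_keys(["< a,A ; ä,Ä ; æ,Æ", "< b,B"])
after = build_keys(["< a,A ; ä,Ä", "< æ,Æ", "< b,B"])

print(before["æ"], after["æ"])
print(sorted(["b", "æ", "a"], key=lambda w: [after[c] for c in w]))
```

In the old layout æ shares its PRIMARY with a and differs only at SECONDARY
strength; in the new layout it has a PRIMARY of its own between a and b.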

The main reason for the patch is to fix all the other sort/cp*.txt
files that had the line " < #", which was taken as a comment, resulting in
"#" being ignored in collation.

With the Display patch (sent previously, but also attached here), it
can reproduce the resource/sort file from the binary SRT section. 

Ticker

Index: resources/sort/README
===
--- resources/sort/README	(revision 4856)
+++ resources/sort/README	(working copy)
@@ -35,22 +35,24 @@
 I believe that these are arbitary identifiers.  Here is a registry of
 values we are using.  If you make a variation on a code-page
 sort-order then give it a different id2 value.
+It is believed that having sorts with the same id1/id2 but different data loaded
+on the same device will give unexpected results
 
-code-page  id1  id2
+code-page  id1  description
 
-1250       12   1
-1251       8    1
-1252       7    2
-1253       13   1
-1254       14   1
-1255       15   1
-1256       16   1
-1257       17   1
-1258       18   1
-874        11   1
-932        9    1
-936        5    1
-949        10   1
+1250       12   Central European sort
+1251       8    Cyrillic sort
+1252       7    Western European sort
+1253       13   Greek sort
+1254       14   Turkish sort
+1255       15   Hebrew sort
+1256       16?9 Arabic sort		cp1256.txt has id1=9, original version of this doc said 16
+1257       17   Latin Baltic sort
+1258       18   Vietnamese sort
+874        11   Thai. 8-bit		not implemented
+932        9    Japanese. Shift JIS	not implemented. Note id1=9 used by 1256
+936        5    Simplified Chinese	not implemented
+949        10   Korean. Unified Hangui	not implemented
 
-65001      19   4
-0          0    0
+65001      19   Unicode sort
+0          0    ASCII 7-bit sort
Index: resources/sort/cp0.txt
===
--- resources/sort/cp0.txt	(revision 4856)
+++ resources/sort/cp0.txt	(working copy)
@@ -1,9 +1,11 @@
 codepage 0
 id1 0
-id2 1
+# 10-Jan-2022 Increment id2/version. Fix '#' to 0023
+id2 2
 description "ASCII 7-bit sort"
 
 characters
+
 =0008=000e=000f=0010=0011=0012=0013=0014=0015=0016=0017=0018=0019=001a=001b=001c=001d=001e=001f=007f,0001,0002,0003,0004,0005,0006,0007
  < 0009
  < 000a
@@ -32,7 +34,7 @@
  < /
  < \
  < &
- < #
+ < 0023
  < %
  < `
  < ^
@@ -79,3 +81,5 @@
  < x,X
  < y,Y
  < z,Z
+
+# ends
Index: resources/sort/cp1250.txt
===
--- resources/sort/cp1250.txt	(revision 4856)
+++ resources/sort/cp1250.txt	(working copy)
@@ -1,9 +1,11 @@
 codepage 1250
 id1 12
-id2 1
+# 10-Jan-2022 Increment id2/version. Fix '#' to 0023
+id2 2
 description "Central European sort"
 
 characters
+
 =0008=000e=000f=0010=0011=0012=0013=0014=0015=0016=0017=0018=0019=001a=001b=001c=001d=001e=001f=007f=00ad,0001,0002,0003,0004,0005,0006,0007
  < 0009
  < 000a
@@ -45,7 +47,7 @@
  < /
  < \
  < &
- < #
+ < 0023
  < %
  < ‰
  < †
@@ -120,3 +122,5 @@
 expand ˛ to  § 0020
 expand ß to  s s
 expand ™ to  T M
+
+# ends
Index: resources/sort/cp1251.txt
===
--- resources/sort/cp1251.txt	(revision 4856)
+++ resources/sort/cp1251.txt	(working copy)
@@ -1,9 +1,11 @@
 codepage 1251
 id1 8
-id2 1
+# 10-Jan-2022 Increment id2/version. Fix '#' to 0023
+id2 2
 description "Cyrillic sort"
 
 characters
+
 =0008=000e=000f=0010=0011=0012=0013=0014=0015=0016=0017=0018=0019=001a=001b=001c=001d=001e=001f=007f=00ad,0001,0002,0003,0004,0005,0006,0007
  < 0009
  < 000a
@@ -45,7 +47,7 @@
  < /
  < \
  < &
- < #
+ < 0023
  < %
  < ‰
  < †
@@ -152,7 +154,8 @@
  < э,Э
  < ю,Ю
  < я,Я
-
 expand … to  . . .
 expand № to  N o
 expand ™ to  T M
+
+# ends
Index: resources/sort/cp1252.txt
===
--- resources/sort/cp1252.txt	(revision 4856)
+++ resources/sort/cp1252.txt	(working copy)
@@ -1,9 +1,7 @@
-
-
-# This must be first before any 'code' lines.
 codepage 1252
 id1 7
-id2 2
+# 10-Jan-2022 Increment id2/version. Add comment about expansions. Move AE/ae/OE/oe
+id2 3
 description "Western European sort"
 
 characters
@@ -96,7 +94,8 @@
  < 7
  < 8
  < 9
- < a,A,ª ; á,Á ; à,À ; â,Â ; å,Å ; ä,Ä ; ã,Ã ; æ,Æ
+ < a,A,ª ; á,Á ; à,À ; â,Â ; å,Å ; ä,Ä ; ã,Ã
+ < æ,Æ
  < b,B
  < c,C ; ç,Ç
  < d,D ; ð,Ð
@@ -111,7 +110,8 @@
  < l,L
  < m,M
  < n,N ; ñ,Ñ
- < o,O,º ; ó,Ó ; ò,Ò ; ô,Ô ; ö,Ö ; õ,Õ ; ø,Ø ; œ,Œ
+ < o,O,º ; ó,Ó ; ò,Ò ; ô,Ô ; ö,Ö ; õ,Õ 

Re: [mkgmap-dev] Fix and augment sort definitions

2022-01-04 Thread Ticker Berkin
Hi Gerd

On my eTrex 30x:

Same --lower-case map as before. Also loaded is GB map that is disabled
in Setup>Map; it uses same charset & sort, sort has same id1 but
different id2.

Where To? > Cities > Spell Search > "VOS" gives "No Results Found"

Spell Search has option to change the keyboard language. GERMAN allows
input of A/O/U umlaut and ß. "Voß" gives "No Results Found"

So it seems that old HCx works well and the new 30x doesn't.

Tried again after removing the GB map and still fails.

Doing an Address street search (ie skip the city), ß finds ß and ss, ss
doesn't find ß.

I'll investigate and experiment more, mainly MDR17/PrefixIndex.

There is a possible problem with sort.collator
compareOneStrengthWithLength() in that it counts expansions whereas the
short string doesn't. So, for prefix length 4, "Voßaaa" and "Voßbbb" are
considered equal, but the output is "Voßa" only.

Maybe the 2/4 char string should be the major PRIMARY representation:
upper case, unaccented, expanded.
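
A minimal Python sketch of the mismatch described above. It is not
mkgmap's collator: the EXPANSIONS table and the helpers primary_key(),
equal_on_expanded_prefix() and raw_prefix() are hypothetical names, and
plain upper-casing stands in for PRIMARY strength.

```python
# Hypothetical expansion table; the real sort files declare these with "expand" lines.
EXPANSIONS = {"ß": "ss"}


def primary_key(text):
    """Upper-case, expanded form, used here as a stand-in for PRIMARY strength."""
    expanded = "".join(EXPANSIONS.get(ch, ch) for ch in text.lower())
    return expanded.upper()


def equal_on_expanded_prefix(a, b, length):
    """Compare the first `length` units, where an expansion counts as several units."""
    return primary_key(a)[:length] == primary_key(b)[:length]


def raw_prefix(text, length):
    """Truncate the stored string without expanding it first."""
    return text[:length]


if __name__ == "__main__":
    a, b = "Voßaaa", "Voßbbb"
    # Counting the expansion, both names reduce to the same 4-unit prefix.
    print(equal_on_expanded_prefix(a, b, 4))
    # Truncating the unexpanded strings keeps one extra, differing letter.
    print(raw_prefix(a, 4), raw_prefix(b, 4))
```

If the stored prefix were built from primary_key() instead of the raw
string, both views would agree on what a 4-unit prefix is.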

Ticker


___
mkgmap-dev mailing list
mkgmap-dev@lists.mkgmap.org.uk
https://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

Re: [mkgmap-dev] Fix and augment sort definitions

2022-01-04 Thread Ticker Berkin
Hi Gerd

I downloaded niedersachsen-latest.osm.pbf.
Built with current trunk, my resources/sort changes, option --latin
(but not --lower-case).
Loaded gmapsupp onto eTrex HCx
Find>Cities by name "VOSS" and "VOSSBERG" gives me 3 VOSSBERGs, all in
LOWER SAXONY, including the one you mention.
The eTrex name entry has a shift that allows entry of accented/eszett/ae
etc. "VOß" etc finds the same.

Rebuilding with --lower-case, Find>Cities "VOS", "VOSS", "VOSSB" ...
all work, showing the 3 "Voßberg"s.
"VOß", "VOßB" etc also work.

The only strangeness is that the name entry also has a lower case
shift, for both standard latin and the extra chars as mentioned.
Shifting to lower-case, all letters are disabled except "ß".

Find Address doesn't allow entry for this city because none of them
have any streets.

I've just noticed that my version of trunk has mdrUnicode_v9b.patch
applied, but the only significant difference is in MDR25 and will only
affect street cities rather than POI ones.

I will do the same testing on the other eTrex.

Ticker

Re: [mkgmap-dev] Fix and augment sort definitions

2022-01-04 Thread Gerd Petermann
Hi Ticker,

OK, maybe you find more on that.
BTW: Voßberg is very special as it probably also influences the MDR 17 content.

I think I'll merge the faster-mp branch into trunk this afternoon and continue
on the Huffman encoding later.

Gerd

Re: [mkgmap-dev] Fix and augment sort definitions

2022-01-04 Thread Ticker Berkin
Hi Gerd

I'm just building your area to test on my UK configured devices.

I'm not sure yet of the benefit of testing with the mdr2 branch.
Actually, until we've better understood the indexing issues thrown up
by use of --lower-case, the extra complications of ß seem too much.

Ticker

Re: [mkgmap-dev] Fix and augment sort definitions

2022-01-04 Thread Gerd Petermann
Hi Ticker,

I think you need option -g with svn log to see changes done in branches.

Anyhow, I think I made a mistake back then because I didn't think of devices
which are not configured to show a German keyboard (which has keys for äöü and ß).

You are probably right that the expands are better for this case and I don't
remember what problems I had with the ß. It is very likely that the open
collator strength questions are the better approach.

I test with a tile around my hometown Wildeshausen
55410043: 2447360,389120 to 2469888,407552
#   : 52.514648,8.349609 to 52.998047,8.745117
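
The two bounding-box lines above appear to be the same box, once in map
units and once in degrees. Assuming 24-bit map units (360 degrees spread
over 2^24 units), the conversion can be checked with a few lines of
Python. map_units_to_degrees() is an illustrative helper, not an mkgmap
function.

```python
def map_units_to_degrees(units, bits=24):
    """Convert map units to degrees, assuming 360 degrees span 2**bits units."""
    return units * 360.0 / (1 << bits)


if __name__ == "__main__":
    # Tile bounds as quoted in the message: lat,lon to lat,lon in map units.
    for units in (2447360, 389120, 2469888, 407552):
        print(units, round(map_units_to_degrees(units), 6))
```

Rounded to six places, the four values reproduce the degrees shown on
the second line.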

One problem that I found on the Oregon is the search for a
name that appears as a nearby city called "Voßberg"
https://www.openstreetmap.org/node/599127249

This name doesn't appear in the Oregon's basemap.
I created a map with --latin and --lower-case.
When I search for Voß it is not found. Same when I search
for Voßber. Only the search for the full name works.
Voss is also not found, but Vossberg works.

Without the patch, search for Voß returns Voßberg,
search for Vossberg does that also, which is quite confusing
for me.
My basemap shows
--- Description ---
0035 | 41 53 43 49 49 20 53 6f | Description: ASCII Sort
003d | 72 74 00                |
--- Character table header ---
0040 | 00 | 5c 00   | sub header len 92
0042 | 02 | 01 00   | id1 1
0044 | 04 | 01 00   | id2 1
0046 | 06 | e4 04   | codepage 1252
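
The header fields in this dump read as 16-bit little-endian integers,
which is how "e4 04" comes out as 1252. A small Python check of the
values shown, with u16_le() and c_string() as illustrative helpers that
are not part of mkgmap:

```python
import struct


def u16_le(hex_bytes):
    """Decode two hex bytes such as 'e4 04' as an unsigned little-endian 16-bit value."""
    return struct.unpack("<H", bytes.fromhex(hex_bytes))[0]


def c_string(hex_bytes):
    """Decode a NUL-terminated ASCII string from hex bytes."""
    return bytes.fromhex(hex_bytes).split(b"\x00", 1)[0].decode("ascii")


if __name__ == "__main__":
    print("description:", c_string("41 53 43 49 49 20 53 6f 72 74 00"))
    print("sub header len:", u16_le("5c 00"))
    print("id1:", u16_le("01 00"))
    print("id2:", u16_le("01 00"))
    print("codepage:", u16_le("e4 04"))
```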

Maybe I should repeat those tests with the mdr2 branch?

Gerd

Re: [mkgmap-dev] Fix and augment sort definitions

2022-01-04 Thread Ticker Berkin
Hi Gerd

Sorry - I hadn't noticed these changes. They don't show up with
$ svn log resources/sort/cp1252.txt or cp1254.txt

All the other mkgmap sort files have all the expansions possible
including the eszett and diphthongs if applicable.

The two non-mkgmap sort files (848.SRT/Turkey and
I3A0.SRT/adriatic TOPO) have expand for "ß" and some of "Œ"... so I
presumed it was expected and reasonably supported.

In the binaries, the expand is expressed as a list of sortOrders 
{primary,secondary,tertiary}. The secondary and tertiary are disrupted
and don't match ones from actual characters (in the case of "ß", the
two s's get different secondaries). So these double chars will sort
after the real char and only match with PRIMARY.
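
A minimal Python sketch of that idea, under loudly stated assumptions:
the SortOrder values below are invented for illustration and are not
taken from any SRT file, and keys(), equal_at() and sort_key() are
hypothetical helpers. It only shows why an expanded "ß" can equal "ss"
at PRIMARY strength while differing, and sorting later, once secondaries
are considered.

```python
from typing import NamedTuple


class SortOrder(NamedTuple):
    primary: int
    secondary: int
    tertiary: int


# Invented values: the two halves of the expansion share the primary of "s"
# but carry secondaries that no real "s" uses.
TABLE = {
    "s": [SortOrder(40, 1, 1)],
    "S": [SortOrder(40, 1, 2)],
    "ß": [SortOrder(40, 2, 1), SortOrder(40, 3, 1)],
}


def keys(text):
    """Flatten a string into its list of sort orders, applying expansions."""
    return [order for ch in text for order in TABLE[ch]]


def equal_at(a, b, strength):
    """Compare at strength 1 (PRIMARY), 2 (plus secondary) or 3 (plus tertiary)."""
    ka, kb = keys(a), keys(b)
    return len(ka) == len(kb) and all(
        x[:strength] == y[:strength] for x, y in zip(ka, kb)
    )


def sort_key(text):
    """Order by all primaries first, then all secondaries, then all tertiaries."""
    ks = keys(text)
    return (
        tuple(k.primary for k in ks),
        tuple(k.secondary for k in ks),
        tuple(k.tertiary for k in ks),
    )


if __name__ == "__main__":
    print(equal_at("ß", "ss", 1), equal_at("ß", "ss", 2))
    print(sorted(["ß", "ss"], key=sort_key))
```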

As there are many unknowns about how to make --lower-case indexing work and
the setting, regarding collation strength, of the bit-flag indicating
same-name in some of the MDR sections, I feel that it is better to have
all the expands.

However, if you are against this, I'll redo cp1252 without these
expansions. I'm not sure of the basis of having the diphthongs as
alternate secondaries of their first character and the eszett as a
unique character.

Ticker

[mkgmap-dev] Fix and augment sort definitions

2022-01-03 Thread Gerd Petermann
Hi Ticker,

see
https://www.mkgmap.org.uk/websvn/revision.php?repname=mkgmap=3948
https://www.mkgmap.org.uk/websvn/revision.php?repname=mkgmap=3949

the sortResource.patch reverts these changes.

In Mapsource the results are a bit better with your patch.
I'll try again with my Oregon later.

Gerd

[mkgmap-dev] Fix and augment sort definitions

2022-01-02 Thread Ticker Berkin
Hi Gerd

In the resources/sort/cp*.txt sort definitions, I noticed that all but
cp1252.txt/Western European and cp65001.txt/unicode don't handle "#"
correctly and cp1252.txt/Western European and cp1254.txt/Turkish don't
add expansions for all their double-characters.

Patch attached that fixes these problems, makes the layout consistent,
adds useful information to README and removes the id2 current value,
which just gets out-of-date.

All the id2 values are incremented if appropriate.

The "#" fix might be the reason why searching for street names starting
with # didn't work for code-page=0 / format6
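
To illustrate how the "#" line gets lost, here is a simplified Python
sketch of a line reader that treats "#" as the start of a comment. It is
an assumption about the general mechanism, not mkgmap's parser:
strip_comment(), parse_char_token() and parse_order_line() are
hypothetical, and only single-token " < x" lines are handled.

```python
HEX_DIGITS = set("0123456789abcdefABCDEF")


def strip_comment(line):
    """Drop everything from the first '#' to the end of the line."""
    return line.split("#", 1)[0].rstrip()


def parse_char_token(token):
    """Assumed rule: a four-digit hex token names a code point, anything else is literal."""
    if len(token) == 4 and set(token) <= HEX_DIGITS:
        return chr(int(token, 16))
    return token


def parse_order_line(line):
    """Return the characters on a ' < ...' ordering line, or None for other lines."""
    body = strip_comment(line).strip()
    if not body.startswith("<"):
        return None
    return [parse_char_token(token) for token in body[1:].split()]


if __name__ == "__main__":
    print(parse_order_line(" < #"))     # the '#' is swallowed as a comment
    print(parse_order_line(" < 0023"))  # the hex form survives and decodes to '#'
```

Writing the character as 0023 sidesteps the comment rule entirely, which
is what the patch does in each file.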

Adding the "ß" expansion might be a step towards resolving the problem
with searching and --lower-case that was the subject of discussion a year
or so ago - I can't find the thread.

Ticker

Index: resources/sort/README
===
--- resources/sort/README	(revision 4842)
+++ resources/sort/README	(working copy)
@@ -35,22 +35,24 @@
 I believe that these are arbitary identifiers.  Here is a registry of
 values we are using.  If you make a variation on a code-page
 sort-order then give it a different id2 value.
+It is believed that having sorts with the same id1/id2 but different data loaded
+on the same device will give unexpected results
 
-code-page  id1  id2
+code-page  id1  description
 
-1250       12   1
-1251       8    1
-1252       7    2
-1253       13   1
-1254       14   1
-1255       15   1
-1256       16   1
-1257       17   1
-1258       18   1
-874        11   1
-932        9    1
-936        5    1
-949        10   1
+1250       12   Central European sort
+1251       8    Cyrillic sort
+1252       7    Western European sort
+1253       13   Greek sort
+1254       14   Turkish sort
+1255       15   Hebrew sort
+1256       16?9 Arabic sort              cp1256.txt has id1=9, original version of this doc said 16
+1257       17   Latin Baltic sort
+1258       18   Vietnamese sort
+874        11   Thai. 8-bit              not implemented
+932        9    Japanese. Shift JIS      not implemented. Note id1=9 used by 1256
+936        5    Simplified Chinese       not implemented
+949        10   Korean. Unified Hangui   not implemented
 
-65001      19   4
-0          0    0
+65001      19   Unicode sort
+0          0    ASCII 7-bit sort
Index: resources/sort/cp0.txt
===
--- resources/sort/cp0.txt	(revision 4842)
+++ resources/sort/cp0.txt	(working copy)
@@ -1,9 +1,11 @@
 codepage 0
 id1 0
-id2 1
+# 02-Jan-2022 changed version from 1 to 2. fix '#' to 0023
+id2 2
 description "ASCII 7-bit sort"
 
 characters
+
 =0008=000e=000f=0010=0011=0012=0013=0014=0015=0016=0017=0018=0019=001a=001b=001c=001d=001e=001f=007f,0001,0002,0003,0004,0005,0006,0007
  < 0009
  < 000a
@@ -32,7 +34,7 @@
  < /
  < \
  < &
- < #
+ < 0023
  < %
  < `
  < ^
@@ -79,3 +81,5 @@
  < x,X
  < y,Y
  < z,Z
+
+# ends
Index: resources/sort/cp1250.txt
===
--- resources/sort/cp1250.txt	(revision 4842)
+++ resources/sort/cp1250.txt	(working copy)
@@ -1,9 +1,11 @@
 codepage 1250
 id1 12
-id2 1
+# 02-Jan-2022 changed version from 1 to 2. fix '#' to 0023
+id2 2
 description "Central European sort"
 
 characters
+
 =0008=000e=000f=0010=0011=0012=0013=0014=0015=0016=0017=0018=0019=001a=001b=001c=001d=001e=001f=007f=00ad,0001,0002,0003,0004,0005,0006,0007
  < 0009
  < 000a
@@ -45,7 +47,7 @@
  < /
  < \
  < &
- < #
+ < 0023
  < %
  < ‰
  < †
@@ -120,3 +122,5 @@
 expand ˛ to  § 0020
 expand ß to  s s
 expand ™ to  T M
+
+# ends
Index: resources/sort/cp1251.txt
===
--- resources/sort/cp1251.txt	(revision 4842)
+++ resources/sort/cp1251.txt	(working copy)
@@ -1,9 +1,11 @@
 codepage 1251
 id1 8
-id2 1
+# 02-Jan-2022 changed version from 1 to 2. fix '#' to 0023
+id2 2
 description "Cyrillic sort"
 
 characters
+
 =0008=000e=000f=0010=0011=0012=0013=0014=0015=0016=0017=0018=0019=001a=001b=001c=001d=001e=001f=007f=00ad,0001,0002,0003,0004,0005,0006,0007
  < 0009
  < 000a
@@ -45,7 +47,7 @@
  < /
  < \
  < &
- < #
+ < 0023
  < %
  < ‰
  < †
@@ -152,7 +154,8 @@
  < э,Э
  < ю,Ю
  < я,Я
-
 expand … to  . . .
 expand № to  N o
 expand ™ to  T M
+
+# ends
Index: resources/sort/cp1252.txt
===
--- resources/sort/cp1252.txt	(revision 4842)
+++ resources/sort/cp1252.txt	(working copy)
@@ -1,9 +1,7 @@
-
-
-# This must be first before any 'code' lines.
 codepage 1252
 id1 7
-id2 2
+# 15-Dec-2021 changed version from 2 to 3. Add expansions for AE/ae/OE/oe/ss
+id2 3
 description "Western European sort"
 
 characters
@@ -96,7 +94,7 @@
  < 7
  < 8
  < 9
- < a,A,ª ; á,Á ; à,À ; â,Â ; å,Å ; ä,Ä ; ã,Ã ; æ,Æ
+ < a,A,ª ; á,Á ; à,À ; â,Â ; å,Å ; ä,Ä ; ã,Ã
  < b,B
  < c,C ; ç,Ç
  < d,D ; ð,Ð
@@ -111,12 +109,11 @@
  < l,L
  < m,M
  < n,N ; ñ,Ñ
- < o,O,º ; ó,Ó ; ò,Ò ; ô,Ô ; ö,Ö ; õ,Õ ; ø,Ø ; œ,Œ
+ < o,O,º ; ó,Ó ;