Re: NetBSD sort l10n: I give up!

2002-04-07 Thread Jeroen Ruigrok/asmodai

-On [20020407 07:00], Andrey A. Chernov ([EMAIL PROTECTED]) wrote:
>So, I plan to remove all vestiges of NetBSD sort and ask to restore GNU
>sort from the Attic. Reasons are:

Better option:

1) leave NetBSD sort
2) unhook from build
3) add GNU sort back for now
4) fix up NetBSD sort

That you are unable doesn't mean others are unable as well. :)

-- 
Jeroen Ruigrok van der Werven / asmodai / Kita no Mono
asmodai@[wxs.nl|xmach.org], finger [EMAIL PROTECTED]
http://www.softweyr.com/asmodai/ | http://www.[tendra|xmach].org/
Resolve to find thyself; and to know that he who finds himself, loses
his misery...

To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-current in the body of the message



Re: NetBSD sort l10n: I give up!

2002-04-07 Thread Andrey A. Chernov

On Sun, Apr 07, 2002 at 11:48:15 +0200, Jeroen Ruigrok/asmodai wrote:
> -On [20020407 07:00], Andrey A. Chernov ([EMAIL PROTECTED]) wrote:
> >So, I plan to remove all vestiges of NetBSD sort and ask to restore GNU
> >sort from the Attic. Reasons are:
>
> Better option:
>
> 1) leave NetBSD sort
> 2) unhook from build
> 3) add GNU sort back for now

It is not better but the same as mine. I don't plan to remove inactive
contrib stuff.

> 4) fix up NetBSD sort
>
> That you are unable doesn't mean others are unable as well. :)

In theory, yes, but in practice I am sure that nobody can ever fix it
without a total redesign of the code flow.

-- 
Andrey A. Chernov
http://ache.pp.ru/




Re: NetBSD sort l10n: I give up!

2002-04-07 Thread Jeroen Ruigrok/asmodai

-On [20020407 12:00], Andrey A. Chernov ([EMAIL PROTECTED]) wrote:
>On Sun, Apr 07, 2002 at 11:48:15 +0200, Jeroen Ruigrok/asmodai wrote:
>> -On [20020407 07:00], Andrey A. Chernov ([EMAIL PROTECTED]) wrote:
>> >So, I plan to remove all vestiges of NetBSD sort and ask to restore GNU
>> >sort from the Attic. Reasons are:
>>
>> Better option:
>>
>> 1) leave NetBSD sort
>> 2) unhook from build
>> 3) add GNU sort back for now
>
>It is not better but the same as mine. I don't plan to remove inactive
>contrib stuff.

That was not what you said in your initial suggestion:

``I plan to remove all vestiges of NetBSD sort'', that really sounds, to me,
as if you were going to cvs rm it.

So, to get it clear, it will remain in contrib?

-- 
Jeroen Ruigrok van der Werven / asmodai / Kita no Mono
asmodai@[wxs.nl|xmach.org], finger [EMAIL PROTECTED]
http://www.softweyr.com/asmodai/ | http://www.[tendra|xmach].org/
And I'm learning the highs and lows of the fake promises...




Re: NetBSD sort l10n: I give up!

2002-04-07 Thread Andrey A. Chernov

On Sun, Apr 07, 2002 at 12:13:50 +0200, Jeroen Ruigrok/asmodai wrote:

> >It is not better but the same as mine. I don't plan to remove inactive
> >contrib stuff.
>
> That was not what you said in your initial suggestion:
>
> ``I plan to remove all vestiges of NetBSD sort'', that really sounds, to me,
> as if you were going to cvs rm it.
>
> So, to get it clear, it will remain in contrib?

Sorry if I was unclear; I meant the functionality. Yes, it will remain in
contrib in case somebody needs it; I am not picky about inactive stuff. As
you can see in my second message (the one after "I give up"), I even
suggested installing it under a different name, if someone wants it.

-- 
Andrey A. Chernov
http://ache.pp.ru/




Re: NetBSD sort l10n: I give up!

2002-04-07 Thread Tim J. Robbins

Here is a patch to make NetBSD's sort(1) sort by the locale's collating
order. The table should not be called ascii[] anymore, but I can't think of
a better name, and supplying a patch to change the name would be pointless.

It works. It assumes the string strxfrm() outputs is the same length as
its input, which is always possible, and true on FreeBSD.
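
To make the assumption above concrete, here is a minimal C sketch (not part
of the patch) of how one might check it. strxfrm() is standard C and returns
the length of the transformed string; the helper names xfrm_len,
fits_one_byte and xfrm_dup are hypothetical, and the results depend on
whichever locale the caller has selected with setlocale().

```c
#include <stdlib.h>
#include <string.h>

/* Length strxfrm() needs for s in the current locale, excluding the NUL. */
size_t
xfrm_len(const char *s)
{
	/* With a size of 0 the destination may be NULL; only the length is computed. */
	return strxfrm(NULL, s, 0);
}

/* Hypothetical helper: 1 if byte c transforms to at most one byte. */
int
fits_one_byte(unsigned char c)
{
	char in[2] = { (char)c, '\0' };

	return xfrm_len(in) <= 1;
}

/* Transformed copy of s, sized by a first strxfrm() call; caller frees. */
char *
xfrm_dup(const char *s)
{
	size_t n = strxfrm(NULL, s, 0);
	char *out = malloc(n + 1);

	if (out != NULL)
		strxfrm(out, s, n + 1);
	return out;
}
```

Calling setlocale(LC_COLLATE, "") first and then fits_one_byte() for each
byte in 1..255 would show which characters of a given locale, if any, break
the one-byte assumption.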

$ env LC_COLLATE=fr_FR.ISO8859-1 sort test.fr | rs
Ète   elle
$ env LC_COLLATE=fr_FR.ISO8859-1 ./sort test.fr | rs
Ète   elle
$ rs test.fr
elle  Ète

Enjoy (?)


Tim


Index: init.c
===================================================================
RCS file: /home/ncvs/src/contrib/sort/init.c,v
retrieving revision 1.2
diff -u -r1.2 init.c
--- init.c	2002/04/07 00:49:00	1.2
+++ init.c	2002/04/07 10:29:59
@@ -46,6 +46,7 @@
 #endif /* not lint */
 
 #include <ctype.h>
+#include <err.h>
 #include <string.h>
 
 static void insertcol __P((struct field *));
@@ -291,8 +292,7 @@
  * Note: when sorting in forward order, to encode character zero in a key,
  * use \001\001; character 1 becomes \001\002.  In this case, character 0
  * is reserved for the field delimiter.  Analagously for -r (fld_d = 255).
- * Note: this is only good for ASCII sorting.  For different LC 's,
- * all bets are off.  See also num_init in number.c
+ * See also num_init in number.c
  */
 void
 settables(gflags)
@@ -300,8 +300,20 @@
 {
 	u_char *wts;
 	int i, incr;
+	static int warned;
+	char abuf[2], xbuf[8];
+
+	abuf[1] = '\0';
 	for (i=0; i < 256; i++) {
-		ascii[i] = i;
+		if (i != 0) {
+			*abuf = i;
+			if (strxfrm(xbuf, abuf, sizeof(xbuf)) > 1 && !warned) {
+				warnx("collating order too complicated");
+				warned = 1;
+			}
+			ascii[i] = *xbuf;
+		} else
+			ascii[i] = 0;
 		if (i > REC_D && i < 255 - REC_D+1)
 			Rascii[i] = 255 - i + 1;
 		else




Re: NetBSD sort l10n: I give up!

2002-04-07 Thread Andrey A. Chernov

On Sun, Apr 07, 2002 at 20:40:13 +1000, Tim J. Robbins wrote:
>
> It works. It assumes the string strxfrm() outputs is the same length as
> its input, which is always possible, and true on FreeBSD.

It seems you are trying to follow the same path I did :-)
No, it does not work, since it breaks so many other places.
Please run some tests before posting the first idea that comes to mind.

I suggest the following test first:
every combination of no flags, -r, -f and -n, for all FreeBSD locales,
compared against GNU sort.

The next test is the -R option over the 0..255 range for all locales.

Before you end up building correct tables for ascii, Rascii, Ftable and
RFtable, I can tell you that correct tables for them break -n badly.
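
One possible way to build an ascii[]-style table that follows the locale,
sketched in C under simplifying assumptions: rank every byte by strcoll() on
one-character strings. The names cmp_bytes and build_rank_table are
hypothetical and this is not code from NetBSD sort. It ignores characters
that collate as multi-character sequences, it gives distinct ranks to bytes
that collate equal, and, as stated above, it does nothing to keep -n working.

```c
#include <stdlib.h>
#include <string.h>

/* qsort() comparator: order two byte values by strcoll() on 1-char strings. */
static int
cmp_bytes(const void *a, const void *b)
{
	unsigned char ca = *(const unsigned char *)a;
	unsigned char cb = *(const unsigned char *)b;
	char sa[2] = { (char)ca, '\0' };
	char sb[2] = { (char)cb, '\0' };
	int r = strcoll(sa, sb);

	if (r != 0)
		return r;
	/* Tie-break on the byte value so the order is total. */
	return (int)ca - (int)cb;
}

/*
 * Fill rank[c] with the position of byte c in the current locale's
 * collating order.  Byte 0 is pinned to rank 0, as in the patch above.
 */
void
build_rank_table(unsigned char rank[256])
{
	unsigned char order[255];
	int i;

	for (i = 0; i < 255; i++)
		order[i] = (unsigned char)(i + 1);
	qsort(order, 255, sizeof(order[0]), cmp_bytes);
	rank[0] = 0;
	for (i = 0; i < 255; i++)
		rank[order[i]] = (unsigned char)(i + 1);
}
```

The table is always a permutation of 0..255, so no two bytes share a weight;
whether that is the right behaviour for -f is exactly the kind of question
the tests listed above would have to settle.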

-- 
Andrey A. Chernov
http://ache.pp.ru/




Re: NetBSD sort l10n: I give up!

2002-04-07 Thread Andrey A. Chernov

On Sun, Apr 07, 2002 at 14:55:37 +0400, Andrey A. Chernov wrote:
>
> Before you end up building correct tables for ascii, Rascii, Ftable and
> RFtable, I can tell you that correct tables for them break -n badly.

I can additionally note that building correct tables for Ftable and
RFtable is especially hard: conflicts appear because of duplicated
lower/upper character ranges, and they must be resolved by additional
shifting away from REC_D in an unknown direction, which may not be possible
as a single-pass operation (i.e. one that does not overwrite REC_D again)
for a given locale.

-- 
Andrey A. Chernov
http://ache.pp.ru/




Re: NetBSD sort l10n: I give up!

2002-04-07 Thread David O'Brien

On Sun, Apr 07, 2002 at 02:55:37PM +0400, Andrey A. Chernov wrote:
> I suggest the following test first:
> every combination of no flags, -r, -f and -n, for all FreeBSD locales,
> compared against GNU sort.
>
> The next test is the -R option over the 0..255 range for all locales.

Perhaps you could make a test suite and commit to
[gnu/]usr.bin/sort/testsuite ?

Try:  cd /usr/src/usr.bin/bzip2 ; make all test

This way one would know when you would be happy with a GNU sort
replacement.




Re: NetBSD sort l10n: I give up!

2002-04-07 Thread Andrey A. Chernov

On Sun, Apr 07, 2002 at 04:30:31 -0700, David O'Brien wrote:
> On Sun, Apr 07, 2002 at 02:55:37PM +0400, Andrey A. Chernov wrote:
> > I suggest the following test first:
> > every combination of no flags, -r, -f and -n, for all FreeBSD locales,
> > compared against GNU sort.
> >
> > The next test is the -R option over the 0..255 range for all locales.
>
> Perhaps you could make a test suite and commit to
> [gnu/]usr.bin/sort/testsuite ?
>

Yes, after GNU sort is restored. I have already sent a request to cvs@

-- 
Andrey A. Chernov
http://ache.pp.ru/




Re: NetBSD sort l10n: I give up!

2002-04-07 Thread Dmitry Sivachenko

On Sun, Apr 07, 2002 at 10:00:08AM +0400, Andrey A. Chernov wrote:
> On Sun, Apr 07, 2002 at 08:52:21 +0400, Andrey A. Chernov wrote:
> > It is sad news, but I tried my best to l10n NetBSD sort in vain; it is
> > tied to ASCII so closely that it is almost impossible to handle all
> > possible cases without embedding AI code far bigger than the whole of sort.
> >
> > So, I plan to remove all vestiges of NetBSD sort and ask to restore GNU
> > sort from the Attic. Reasons are:
>
> For people who need exact NetBSD sort functionality and don't need l10n
> (if they exist), NetBSD sort can be installed under a different name, like
> ascii_sort or bsort.
>

... and from ports.




Re: NetBSD sort l10n: I give up!

2002-04-07 Thread Andrey A. Chernov

On Sun, Apr 07, 2002 at 21:49:44 +1000, Tim J. Robbins wrote:
> On Sun, Apr 07, 2002 at 02:55:37PM +0400, Andrey A. Chernov wrote:
>
> > No, it does not work, since it breaks so many other places.
>
> I guess I have to agree with you there, that it does break -n and -f and does
> not handle (for example) German correctly. I still do believe that a similar
> approach could correctly handle all the ISO8859 character sets, only it's not
> as simple as it seems.

I thought so too, initially. I even have correct ascii, Rascii, Ftable,
RFtable and gweights tables in my last committed variant (but not for the
various -R values). Nope. -n broke them all, because it is hardcoded to
ASCII but the data is sorted in the modified (collated) order. A
back-permutation table does not help, because the main (collated) order
takes different forms depending on the -r and -f flags. With some hacking I
even made the variant without -R work for -n too, but that does not mean it
will not break for some future locale we may have.
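
A toy C illustration of the -n problem described above, under the assumption
that the numeric code tests key bytes against ASCII '0'..'9': once key bytes
have been remapped through a weight table, such a test no longer recognizes
the digits. map_bytes, all_ascii_digits and toy_table are hypothetical names,
and the shift-by-one table is only a stand-in for a real collation table; none
of this is taken from sort itself.

```c
#include <stddef.h>

/* Map every byte of src through table into dst (dst must hold n bytes). */
void
map_bytes(unsigned char *dst, const unsigned char *src, size_t n,
    const unsigned char table[256])
{
	size_t i;

	for (i = 0; i < n; i++)
		dst[i] = table[src[i]];
}

/* ASCII-only digit test: 1 if s holds n > 0 bytes, all in '0'..'9'. */
int
all_ascii_digits(const unsigned char *s, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (s[i] < '0' || s[i] > '9')
			return 0;
	return n > 0;
}

/* Toy weight table: every byte value shifted up by one, modulo 256. */
void
toy_table(unsigned char table[256])
{
	int i;

	for (i = 0; i < 256; i++)
		table[i] = (unsigned char)((i + 1) & 0xff);
}
```

With this table the key "129" becomes the bytes '2', '3', ':' and fails the
digit test, even though the original key was purely numeric.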

-- 
Andrey A. Chernov
http://ache.pp.ru/




Re: NetBSD sort l10n: I give up!

2002-04-07 Thread Dag-Erling Smorgrav

Andrey A. Chernov [EMAIL PROTECTED] writes:
> So, I plan to remove all vestiges of NetBSD sort and ask to restore GNU
> sort from the Attic.

Fair enough.  I don't care as long as it sorts right.

DES
-- 
Dag-Erling Smorgrav - [EMAIL PROTECTED]




Re: NetBSD sort l10n: I give up!

2002-04-07 Thread Dag-Erling Smorgrav

Dag-Erling Smorgrav [EMAIL PROTECTED] writes:
> Andrey A. Chernov [EMAIL PROTECTED] writes:
> > So, I plan to remove all vestiges of NetBSD sort and ask to restore GNU
> > sort from the Attic.
> Fair enough.  I don't care as long as it sorts right.

I must apologize for reacting the way I did, BTW.  I shouldn't have
made those commits; I realize now that I was acting in anger and with
prejudice, which is never a good frame of mind for doing FreeBSD work.

DES
-- 
Dag-Erling Smorgrav - [EMAIL PROTECTED]




Re: NetBSD sort l10n: I give up!

2002-04-07 Thread Jeroen Ruigrok/asmodai

-On [20020407 12:30], Andrey A. Chernov ([EMAIL PROTECTED]) wrote:
>Sorry if I was unclear; I meant the functionality. Yes, it will remain in
>contrib in case somebody needs it; I am not picky about inactive stuff. As
>you can see in my second message (the one after "I give up"), I even
>suggested installing it under a different name, if someone wants it.

Tim J Robbins whipped up some code which seems to take us to the same level
as GNU sort, as far as we could see.

At present the GNU sort we have doesn't seem to be able to handle multibyte
and/or shift states, does it?  As far as he and I could see it was only
8-bit limited.  And his work gives that to the NetBSD sort as well.

-- 
Jeroen Ruigrok van der Werven / asmodai / Kita no Mono
asmodai@[wxs.nl|xmach.org], finger [EMAIL PROTECTED]
http://www.softweyr.com/asmodai/ | http://www.[tendra|xmach].org/
Life can only be understood backwards, but it must be lived forwards...




Re: NetBSD sort l10n: I give up!

2002-04-07 Thread Andrey A. Chernov

On Sun, Apr 07, 2002 at 15:32:18 +0200, Jeroen Ruigrok/asmodai wrote:
> -On [20020407 12:30], Andrey A. Chernov ([EMAIL PROTECTED]) wrote:
> >Sorry if I was unclear; I meant the functionality. Yes, it will remain in
> >contrib in case somebody needs it; I am not picky about inactive stuff. As
> >you can see in my second message (the one after "I give up"), I even
> >suggested installing it under a different name, if someone wants it.
>
> Tim J Robbins whipped up some code which seems to take us to the same level
> as GNU sort, as far as we could see.

What code do you mean? If you mean the patch he posted, that patch is
obviously wrong and does not pass even the simplest tests. It reminds me of
my very early attempts.

> At present the GNU sort we have doesn't seem to be able to handle multibyte
> and/or shift states, does it?  As far as he and I could see it was only
> 8-bit limited.  And his work gives that to the NetBSD sort as well.

Yes, both variants (i.e. NetBSD sort too, if it gets fixed) are limited to
8 bits, like 99% of the other localized software in the base system.
Multibyte l10n is a completely different thing. Which work of his do you
mean? I answered him; read this discussion to the end. He admits that his
change is wrong.
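
The 8-bit limit can be checked from C: MB_CUR_MAX (from <stdlib.h>) is the
maximum number of bytes per character in the current locale, and 256-entry
weight tables can only describe the character set when it is 1. A minimal
sketch, with the hypothetical helper name byte_tables_applicable; a result of
1 is a necessary condition for table-driven collation, not a sufficient one.

```c
#include <stdlib.h>

/*
 * Returns 1 when every character in the current locale is a single byte,
 * i.e. when per-byte weight tables can cover the whole character set.
 */
int
byte_tables_applicable(void)
{
	return MB_CUR_MAX == 1;
}
```

A program that has not called setlocale() runs in the "C" locale, where the
check succeeds; in a multibyte locale it would fail.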

-- 
Andrey A. Chernov
http://ache.pp.ru/




Re: NetBSD sort l10n: I give up!

2002-04-07 Thread Peter Wemm

Andrey A. Chernov wrote:
> On Sun, Apr 07, 2002 at 04:30:31 -0700, David O'Brien wrote:
> > On Sun, Apr 07, 2002 at 02:55:37PM +0400, Andrey A. Chernov wrote:
> > > I suggest the following test first:
> > > every combination of no flags, -r, -f and -n, for all FreeBSD locales,
> > > compared against GNU sort.
> > >
> > > The next test is the -R option over the 0..255 range for all locales.
> >
> > Perhaps you could make a test suite and commit to
> > [gnu/]usr.bin/sort/testsuite ?
> >
>
> Yes, after GNU sort is restored. I have already sent a request to cvs@

There is no need for cvs@ to be involved here.  Just get the old pre-rm
files and 'cvs add' them back again.  There is nothing significant that
is still on the vendor branch that is worth messing around with.

Cheers,
-Peter
--
Peter Wemm - [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]
All of this is for nothing if we don't go to the stars - JMS/B5





Re: NetBSD sort l10n: I give up!

2002-04-06 Thread Andrey A. Chernov

On Sun, Apr 07, 2002 at 08:52:21 +0400, Andrey A. Chernov wrote:
> It is sad news, but I tried my best to l10n NetBSD sort in vain; it is
> tied to ASCII so closely that it is almost impossible to handle all
> possible cases without embedding AI code far bigger than the whole of sort.
>
> So, I plan to remove all vestiges of NetBSD sort and ask to restore GNU
> sort from the Attic. Reasons are:

For people who need exact NetBSD sort functionality and don't need l10n
(if they exist), NetBSD sort can be installed under a different name, like
ascii_sort or bsort.

-- 
Andrey A. Chernov
http://ache.pp.ru/
