Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-07-08 Thread Kris Kennaway
Gábor Kövesdán wrote: Well, it seems you have missed the first bits of the discussion. GNU grep has some regression tests, which it doesn't pass completely itself either. :) I've mentioned here that I used those tests to find out which options are incompatible. Unfortunately, I have to say

Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-07-08 Thread Gábor Kövesdán
Well, it seems you have missed the first bits of the discussion. GNU grep has some regression tests, which it doesn't pass completely itself either. :) I've mentioned here that I used those tests to find out which options are incompatible. Unfortunately, I have to say that BSD grep won't

Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-07-07 Thread Kris Kennaway
Maxim Sobolev wrote: Dag-Erling Smørgrav wrote: Andrey Chernov [EMAIL PROTECTED] writes: BSD sort as an idea would be a good project indeed, but the BSD sort implementation we currently have at hand is totally misleading and should be rewritten from scratch. I realized this when, a long time ago, I

Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-07-07 Thread Andrey Chernov
On Mon, Jul 07, 2008 at 10:06:31PM +0200, Kris Kennaway wrote: What regression suites do other implementations have? e.g. the GNU textutils. They basically have regex tests, but nothing locale specific, since locale ordering is different from platform to platform (until Unicode Collation

Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-07-07 Thread Kris Kennaway
Andrey Chernov wrote: On Mon, Jul 07, 2008 at 10:06:31PM +0200, Kris Kennaway wrote: What regression suites do other implementations have? e.g. the GNU textutils. They basically have regex tests, but nothing locale specific, since locale ordering is different from platform to platform

Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-07-07 Thread Gabor Kovesdan
Kris Kennaway wrote: Andrey Chernov wrote: On Mon, Jul 07, 2008 at 10:06:31PM +0200, Kris Kennaway wrote: What regression suites do other implementations have? e.g. the GNU textutils. They basically have regex tests, but nothing locale specific, since locale ordering is different from

Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-06-24 Thread Gabor Kovesdan
1) You can't just convert the whole buffer after fread(), since it can end in the middle of a multibyte sequence at the BUFSIZ boundary. Look at how the GNU utils do it. OK, I hadn't thought of this aspect. What about this? #define iswbinary(ch) (!iswspace((ch)) && iswcntrl((ch))) int

Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-06-24 Thread Andrey Chernov
On Tue, Jun 24, 2008 at 10:32:17PM +0200, Gabor Kovesdan wrote: ch = fgetwc(f); You must clear errno beforehand and somehow handle an EILSEQ possibly coming from fgetwc(), perhaps by returning ret = 1 (binary); I am not sure. fgetwc() returns WEOF in that case, which is not a true end of
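
The advice above might look like this in practice. It is only a sketch under assumptions: `stream_is_binary` is a hypothetical name, the `iswbinary` macro is the one proposed earlier in the thread, and treating EILSEQ as "binary" is the option the message itself marks as uncertain.

```c
#include <errno.h>
#include <stdio.h>
#include <wchar.h>
#include <wctype.h>

#define iswbinary(ch)	(!iswspace((ch)) && iswcntrl((ch)))

/*
 * Hypothetical helper (not from the thread): returns 1 if the stream
 * looks binary, 0 if it looks like text, -1 on a read error.
 */
int
stream_is_binary(FILE *f)
{
	wint_t ch;

	for (;;) {
		errno = 0;	/* clear first: WEOF alone is ambiguous */
		ch = fgetwc(f);
		if (ch == WEOF) {
			if (errno == EILSEQ)
				return (1);	/* undecodable bytes: assume binary */
			if (ferror(f))
				return (-1);	/* some other read error */
			return (0);	/* true end of file */
		}
		if (iswbinary(ch))
			return (1);
	}
}
```

The order of the checks matters: an encoding error also sets the stream's error indicator, so `errno` is tested for EILSEQ before `ferror()` is consulted.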

Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-06-24 Thread Andrey Chernov
On Wed, Jun 25, 2008 at 01:04:20AM +0400, Andrey Chernov wrote: if ((s = mbstowcs(NULL, f->base, 0)) == -1) return (0); The same here. Check EILSEQ and return 1. BTW, do you realize that this code malloc()s the _whole_file_ into memory (which does not fit for very big

Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-06-22 Thread Gabor Kovesdan
Andrey Chernov wrote: On Wed, Jun 18, 2008 at 12:40:24PM +0200, Dag-Erling Smørgrav wrote: For grep, I believe it should simply be a matter of calling setlocale(), using wide strings, and using a multibyte regex engine (for appropriate values of simply). See my prev reply telling

Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-06-22 Thread Andrey Chernov
On Sun, Jun 22, 2008 at 02:58:17PM +0200, Gabor Kovesdan wrote: Andrey Chernov wrote: On Wed, Jun 18, 2008 at 12:40:24PM +0200, Dag-Erling Smørgrav wrote: For grep, I believe it should simply be a matter of calling setlocale(), using wide strings, and using a multibyte regex engine

Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-06-19 Thread Konrad Jankowski
Maxim Sobolev wrote: A good regression test suite that includes cases in different single- and multi-byte locales for grep/sort/etc. could also be a big help. I will implement test cases for sort in UTF-8 as part of my project.

Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-06-19 Thread Dag-Erling Smørgrav
Konrad Jankowski [EMAIL PROTECTED] writes: BOMs should be handled at the program level. Yeah, that makes sense; libc has no way of knowing whether the start of the string you're processing is actually the start of the file.

Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-06-18 Thread Dag-Erling Smørgrav
Andrey Chernov [EMAIL PROTECTED] writes: BSD sort as an idea would be a good project indeed, but the BSD sort implementation we currently have at hand is totally misleading and should be rewritten from scratch. I realized this when, a long time ago, I tried to localize it for single byte locales. I

Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-06-18 Thread Andrey Chernov
On Tue, Jun 17, 2008 at 12:58:12PM +0200, Gabor Kovesdan wrote: Yes, and once this is done, sort will work out of the box, if it uses strcoll. Already tried on a prototype. Only GNU sort for multibyte chars. BSD sort is programmed too badly and can't be fixed even for single byte

Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-06-18 Thread Andrey Chernov
On Wed, Jun 18, 2008 at 10:22:31AM +0200, Dag-Erling Smørgrav wrote: I think part of the problem is that there aren't enough people who truly understand localization. I think I understand most of it, but I'm pretty sure I *don't* understand how collation works, or is supposed to work.

Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-06-18 Thread Dag-Erling Smørgrav
Andrey Chernov [EMAIL PROTECTED] writes: Single byte locale collation works through strcoll() via chains, i.e. seek all chains starting with a given letter. Multibyte locale collation is currently not implemented and can't be properly implemented under the existing single byte framework (it will

Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-06-18 Thread Dag-Erling Smørgrav
Konrad Jankowski [EMAIL PROTECTED] writes: Dag-Erling Smørgrav [EMAIL PROTECTED] writes: In any case, this is a libc issue, right? As long as sort / grep uses the API correctly, they will work fine once libc is fixed? Correct. Given sort uses strcoll()/wcscoll()/strxfrm()/wcsxfrm() and

Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-06-18 Thread Andrey Chernov
On Wed, Jun 18, 2008 at 11:39:10AM +0200, Dag-Erling Smørgrav wrote: Does that mean our wcsxfrm() doesn't work? IIUC, it should convert wide strings to strings that can be compared directly with strcmp()? (directly with wcscmp()) For single byte locales wcsxfrm() and wcscoll() work, but for
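
The intended use of wcsxfrm() discussed here can be sketched as below. This shows only the standard API contract (transformed strings compared with wcscmp()); it says nothing about whether a given libc implements multibyte collation, which is exactly what the thread questions. The names `xfrm_dup` and `collate_compare` are hypothetical.

```c
#include <stdlib.h>
#include <wchar.h>

/*
 * Hypothetical helper: return a malloc()ed copy of s transformed by
 * wcsxfrm(), or NULL if allocation fails. The caller frees it.
 */
wchar_t *
xfrm_dup(const wchar_t *s)
{
	/* With n == 0 the destination may be NULL; the result is the
	 * length needed, excluding the terminating null. */
	size_t n = wcsxfrm(NULL, s, 0);
	wchar_t *out = malloc((n + 1) * sizeof(*out));

	if (out != NULL)
		wcsxfrm(out, s, n + 1);
	return (out);
}

/*
 * Compare two wide strings in the current locale's collation order
 * by comparing their transformed forms with wcscmp(). Falls back to
 * wcscoll() if memory runs out.
 */
int
collate_compare(const wchar_t *a, const wchar_t *b)
{
	wchar_t *ta = xfrm_dup(a);
	wchar_t *tb = xfrm_dup(b);
	int r;

	if (ta != NULL && tb != NULL)
		r = wcscmp(ta, tb);
	else
		r = wcscoll(a, b);
	free(ta);
	free(tb);
	return (r);
}
```

A sort utility would normally transform each key once and keep the result, since the point of wcsxfrm() over wcscoll() is to avoid repeating the collation work on every comparison.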

Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-06-18 Thread Andrey Chernov
On Wed, Jun 18, 2008 at 12:40:24PM +0200, Dag-Erling Smørgrav wrote: For grep, I believe it should simply be a matter of calling setlocale(), using wide strings, and using a multibyte regex engine (for appropriate values of simply). See my prev reply telling more details. Using wide strings

Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-06-18 Thread Andrey Chernov
On Wed, Jun 18, 2008 at 11:14:16AM +0200, Konrad Jankowski wrote: I think the best place for this type of information is currently my SoC wiki. http://wiki.freebsd.org/KonradJankowski/Collation I know currently it has very little information, however. I can also create another page dedicated

Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-06-18 Thread Sean C. Farley
On Mon, 16 Jun 2008, Dag-Erling Smørgrav wrote: Doug Barton [EMAIL PROTECTED] writes: Andrey Chernov [EMAIL PROTECTED] writes: Please note that BSD grep is not localized (and can't be, by design) and works only with the standard C locale. It may not affect ports system processing, but it surely

Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-06-18 Thread Maxim Sobolev
Dag-Erling Smørgrav wrote: Andrey Chernov [EMAIL PROTECTED] writes: BSD sort as an idea would be a good project indeed, but the BSD sort implementation we currently have at hand is totally misleading and should be rewritten from scratch. I realized this when, a long time ago, I tried to localize it for

Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-06-17 Thread Gabor Kovesdan
Andrey Chernov wrote: On Tue, Jun 17, 2008 at 04:28:10AM +0400, Andrey Chernov wrote: BSD grep does not even bother to call setlocale(). I can't say whether it can be simply healed by adding that call; some test suite run is needed. Quick source inspection reveals that BSD grep

Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-06-17 Thread Dag-Erling Smørgrav
Andrey Chernov [EMAIL PROTECTED] writes: Dag-Erling Smørgrav [EMAIL PROTECTED] writes: We don't have a locale-aware regex implementation. Henry Spencer wrote one for Tcl 8, and it seems to be under an MIT-equivalent license, but I'm not sure how hard it would be to extirpate. It might

Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-06-17 Thread Andrey Chernov
On Tue, Jun 17, 2008 at 09:21:52AM +0200, Gabor Kovesdan wrote: Sorry for the possibly silly question, but what do we mean by localization here in the case of grep? As far as I see, it works with wide chars, because the regex library is aware of those. What other aspect needs to be taken into

Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-06-17 Thread Andrey Chernov
On Tue, Jun 17, 2008 at 11:46:07AM +0400, Andrey Chernov wrote: On Tue, Jun 17, 2008 at 09:21:52AM +0200, Gabor Kovesdan wrote: Sorry for the possibly silly question, but what do we mean by localization here in the case of grep? As far as I see, it works with wide chars, because the regex

Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-06-17 Thread Diomidis Spinellis
Gabor Kovesdan wrote: In the case of sort, I understand that it should explicitly handle wide characters due to the different alphabets of the different languages and yes, that seems to be a difficult task... Note that Konrad Jankowski in another SoC project is adding to our C library support

Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-06-17 Thread Andrey Chernov
On Tue, Jun 17, 2008 at 12:08:38PM +0200, Dag-Erling Smørgrav wrote: I hadn't noticed... ISTR it was an issue back when jphoward wrote his BSD-licensed grep. BSD grep has enough problems (though not fatal ones, as in BSD sort) even with the single byte locales we support initially in our regex (old

Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-06-17 Thread Konrad Jankowski
Diomidis Spinellis wrote: Gabor Kovesdan wrote: In the case of sort, I understand that it should explicitly handle wide characters due to the different alphabets of the different languages and yes, that seems to be a difficult task... Note that Konrad Jankowski in another SoC project is adding

Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-06-17 Thread Andrey Chernov
On Tue, Jun 17, 2008 at 10:54:42AM +0200, Konrad Jankowski wrote: Diomidis Spinellis wrote: Gabor Kovesdan wrote: In the case of sort, I understand that it should explicitly handle wide characters due to the different alphabets of the different languages and yes, that seems to be a

Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-06-17 Thread Gabor Kovesdan
Andrey Chernov wrote: On Tue, Jun 17, 2008 at 10:54:42AM +0200, Konrad Jankowski wrote: Diomidis Spinellis wrote: Gabor Kovesdan wrote: In the case of sort, I understand that it should explicitly handle wide characters due to the different alphabets of the different languages

Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-06-17 Thread Gabor Kovesdan
Doug Barton wrote: I use the following construct in portmaster, where pdb=/var/db/pkg, origin is set to the origin of a given port, and ro_opd is usually empty, but can be another origin directory or the same one. To guarantee that you should get some kind of results you can test with

Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-06-17 Thread Gabor Kovesdan
Doug Barton wrote: I use the following construct in portmaster, where pdb=/var/db/pkg, origin is set to the origin of a given port, and ro_opd is usually empty, but can be another origin directory or the same one. To guarantee that you should get some kind of results you can test with

Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-06-17 Thread Jaakko Heinonen
On 2008-06-17, Gabor Kovesdan wrote: egrep: empty (sub)expression I've looked at this and I have a patch with a workaround: http://kovesdan.org/patches/grep.dougb.diff Unfortunately this breaks things. For example: $ grep -E '(test||test)' /dev/null grep: parentheses not balanced $ grep

Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-06-16 Thread Joerg Sonnenberger
On Sun, Jun 15, 2008 at 09:11:36PM -0700, Garrett Cooper wrote: Now all we need to do is write / import a BSD compatible less(1) into FreeBSD =). less is dual licensed. Joerg

Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-06-16 Thread Dag-Erling Smørgrav
Doug Barton [EMAIL PROTECTED] writes: Andrey Chernov [EMAIL PROTECTED] writes: Please note that BSD grep is not localized (and can't be, by design) and works only with the standard C locale. It may not affect ports system processing, but it surely affects real text handling. That is very

Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-06-16 Thread Doug Barton
Dag-Erling Smørgrav wrote: Doug Barton [EMAIL PROTECTED] writes: Andrey Chernov [EMAIL PROTECTED] writes: Please note that BSD grep is not localized (and can't be, by design) and works only with the standard C locale. It may not affect ports system processing, but it surely affects real text

Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-06-16 Thread Andrey Chernov
On Mon, Jun 16, 2008 at 02:36:23PM +0200, Dag-Erling Smørgrav wrote: Please note that BSD grep is not localized (and can't be, by design) and works only with the standard C locale. It may not affect ports system processing, but it surely affects real text handling. That is very troubling. In

Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-06-16 Thread Andrey Chernov
On Tue, Jun 17, 2008 at 04:22:25AM +0400, Andrey Chernov wrote: On Mon, Jun 16, 2008 at 02:36:23PM +0200, Dag-Erling Smørgrav wrote: Please note that BSD grep is not localized (and can't be, by design) and works only with the standard C locale. It may not affect ports system processing

Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-06-16 Thread Andrey Chernov
On Tue, Jun 17, 2008 at 04:28:10AM +0400, Andrey Chernov wrote: BSD grep does not even bother to call setlocale(). I can't say whether it can be simply healed by adding that call; some test suite run is needed. Quick source inspection reveals that BSD grep operates with single bytes only (util.c)

Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-06-15 Thread Doug Barton
I use the following construct in portmaster, where pdb=/var/db/pkg, origin is set to the origin of a given port, and ro_opd is usually empty, but can be another origin directory or the same one. To guarantee that you should get some kind of results you can test with origin=devel/gettext.

Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-06-15 Thread Diomidis Spinellis
Doug Barton wrote: I use the following construct in portmaster, where pdb=/var/db/pkg, origin is set to the origin of a given port, and ro_opd is usually empty, but can be another origin directory or the same one. To guarantee that you should get some kind of results you can test with

Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-06-15 Thread Gabor Kovesdan
Doug Barton wrote: I use the following construct in portmaster, where pdb=/var/db/pkg, origin is set to the origin of a given port, and ro_opd is usually empty, but can be another origin directory or the same one. To guarantee that you should get some kind of results you can test with

Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-06-15 Thread Kövesdán Gábor
Diomidis Spinellis wrote: Doug Barton wrote: I use the following construct in portmaster, where pdb=/var/db/pkg, origin is set to the origin of a given port, and ro_opd is usually empty, but can be another origin directory or the same one. To guarantee that you should get some kind of

Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-06-15 Thread Andrey Chernov
On Sun, Jun 15, 2008 at 09:17:01PM +0200, Kövesdán Gábor wrote: Yes, of course, I haven't forgotten about your suggestion. First, I'd like to process the trivial errors, which come up like this one and make some tests myself. Then I'll think about this idea and ask portmgr to do an exp-run

Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-06-15 Thread Garrett Cooper
On Sun, Jun 15, 2008 at 2:26 PM, Andrey Chernov [EMAIL PROTECTED] wrote: On Sun, Jun 15, 2008 at 09:17:01PM +0200, Kövesdán Gábor wrote: Yes, of course, I haven't forgotten about your suggestion. First, I'd like to process the trivial errors, which come up like this one and make some tests

Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-06-15 Thread Doug Barton
Andrey Chernov wrote: On Sun, Jun 15, 2008 at 09:17:01PM +0200, Kövesdán Gábor wrote: Yes, of course, I haven't forgotten about your suggestion. First, I'd like to process the trivial errors, which come up like this one and make some tests myself. Then I'll think about this idea and ask portmgr

CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

2008-06-14 Thread Gabor Kovesdan
Hello All, Today I've basically finished the feature completion of the BSD-licensed grep from OpenBSD. This means that I've accomplished the following tasks: - Implement --label - Implement --null - Implement --color / --colour - Implement -D / --devices - Implement -H / --with-filename -