bug#41700: grep -v always exiting with 1 for empty file

2020-06-04 Thread Eric Blake
-l file -rw-r--r-- 1 eggert eggert 0 Jun 4 13:24 file have one or zero lines? Empty files have no lines. On 6/4/20 1:13 PM, Eric Blake wrote: The most intuitive behavior is that grep behaves as if the file included the trailing newline That's what grep does with files that end in a non-new
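
The point that an empty file has no lines can be checked directly. The following is a minimal sketch, not taken from the thread; the file names `empty` and `blank` and the pattern `x` are chosen here purely for illustration.

```shell
# Sketch: grep -v on a zero-byte file versus a file holding one empty line.
# File names and the pattern "x" are illustrative assumptions.
tmpdir=$(mktemp -d)
: > "$tmpdir/empty"            # zero bytes, therefore zero lines
printf '\n' > "$tmpdir/blank"  # one line that happens to be empty

empty_status=0
grep -v x "$tmpdir/empty" || empty_status=$?   # no lines, so nothing can be selected
blank_status=0
grep -v x "$tmpdir/blank" || blank_status=$?   # the empty line does not contain x

echo "empty: $empty_status, blank: $blank_status"
rm -rf "$tmpdir"
```

With no lines to select, `grep -v` exits with status 1 on the empty file, while the one-line file yields status 0.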

bug#41700: grep -v always exiting with 1 for empty file

2020-06-04 Thread Eric Blake
tch) actually displays a newline that was not present in the input, but there have also been historical grep implementations that behave as if the file ended at the last newline, and ignore the trailing garbage even if it would have matched were a trailing newline present. -- Eric Blake, Principal Softw

bug#38503: Locale can cause incorrect number parsing in binary files

2019-12-05 Thread Eric Blake
h version of glibc is in use (as it was glibc 2.28 that tried to use RRI in more locales, although work is still not complete there - and the presence or absence of particular historical glibc regcomp bugs determines whether configure decides to use gnulib's version instead). -- Eric B

bug#38503: Locale can cause incorrect number parsing in binary files

2019-12-05 Thread Eric Blake
On 12/5/19 2:29 PM, Eric Blake wrote: tag 38503 notabug thanks On 12/5/19 12:30 PM, jan h wrote: grep 3.3 Note that the Rational Range Interpretation of ranges claims that [0-9] should have the expansion [0123456789] in ALL locales; and more and more versions of GNU utilities are

bug#38503: Locale can cause incorrect number parsing in binary files

2019-12-05 Thread Eric Blake
sing this as a non-bug. We may reopen it if additional details show that your version of grep was supposed to be using RRI but failed to do so. And feel free to continue conversation, even if we don't reopen the bug. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org

bug#34951: [PATCH] grep: a kwset matcher not work in a grep matcher

2019-03-23 Thread Eric Blake
ersal you want (where find or xargs is used to invoke plain 'grep' on the resulting files) rather than trying to convince us to patch 'grep -r' to have more flexibility. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org

bug#34126: grep v. 3.1 – unexpected error message "grep: i.: No such file or directory"

2019-01-18 Thread Eric Blake
positional usage). The difference becomes apparent in constructs like: grep -e -fi. -i # greps stdin case-insensitively for "-fi." grep -- -fi. -i # greps file ./-i case-sensitively for "-fi." -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org
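
The two command lines quoted above can be tried in a scratch directory. This is a small sketch; the sample input lines and the temporary directory are assumptions added here, not part of the original message.

```shell
# Sketch: "-e PATTERN" versus "--" when the pattern starts with a dash.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
printf '%s\n' '-fix' '-FIx' > ./-i    # a file literally named "-i"

# -e marks "-fi." as the pattern, so "-i" is still parsed as an option:
# stdin is searched case-insensitively.
from_stdin=$(printf '%s\n' '-FIy' | grep -e -fi. -i)

# "--" ends option parsing: "-fi." is the pattern and "-i" is a file operand,
# searched case-sensitively.
from_file=$(grep -- -fi. -i)

echo "stdin: $from_stdin"
echo "file:  $from_file"
cd / && rm -rf "$tmpdir"
```

The first command reports the upper-case line from stdin; the second reports only the lower-case line stored in `./-i`.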

bug#34078: typo in gnu grep's manual ?

2019-01-14 Thread Eric Blake
ep --help' output combined with a template; running: git grep 'is one or' quickly locates the culprit file of doc/grep.in.1. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org

bug#32750: [PATCH 2/2] dfa: optmization of alternation in NFA

2018-09-18 Thread Eric Blake
id -dfaoptimize (struct dfa *d) +dfautf8noss (struct dfa *d) Or even some strategic spelling, as in: dfa_utf8_no_ss -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org

bug#32704: Can grep search for a line feed and a null character at the same time?

2018-09-15 Thread Eric Blake
oded in UTF-16, it's easiest to convert it into UTF-8 for the grep: iconv -f UTF-16 -t UTF-8 < file | grep $modified_pattern \ | iconv -f UTF-8 -t UTF-16 -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org
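
A reduced version of the recoding idea, as a sketch only: it builds a small UTF-16 file (the contents and file name are made up for this example), recodes it to UTF-8 on the fly, and searches the result. It assumes an `iconv` that understands the `UTF-16` and `UTF-8` encoding names.

```shell
# Sketch: search a UTF-16 file by recoding it to UTF-8 first.
tmpdir=$(mktemp -d)
printf 'alpha\nbeta\n' | iconv -f UTF-8 -t UTF-16 > "$tmpdir/utf16.txt"

# After recoding, grep sees ordinary newline-terminated lines.
match=$(iconv -f UTF-16 -t UTF-8 < "$tmpdir/utf16.txt" | grep beta)
echo "$match"
rm -rf "$tmpdir"
```

Piping the result through `iconv -f UTF-8 -t UTF-16` again, as in the quoted pipeline, would restore the original encoding for the matching lines.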

bug#32704: Can grep search for a line feed and a null character at the same time?

2018-09-15 Thread Eric Blake
. But is it at least possible to find “\x0A\x00” with grep? If you bend the rules by throwing -P into the mix, yes :) -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org

bug#32704: Can grep search for a line feed and a null character at the same time?

2018-09-11 Thread Eric Blake
On 9/11/18 12:14 PM, Paul Eggert wrote: On 9/11/18 10:03 AM, Eric Blake wrote: maybe we really do have a bug - when -z is in effect, I'd expect NUL, rather than newline, to be the byte that separates separate patterns in the pattern argument You're right, I think it's a bu

bug#32704: Can grep search for a line feed and a null character at the same time?

2018-09-11 Thread Eric Blake
matches exactly one newline byte at the end of a NUL-separated record). That said, your EASIEST approach is to use iconv to recode your file out of UTF-16 (which is NOT conducive to multi-byte processing), into something friendlier like UTF-8, and then use grep on the converted file. -- Er

bug#32409: make check fails with glibc-2.28

2018-08-09 Thread Eric Blake
. -XFAIL_TESTS += backref-alt +# The backslash-alt test fails for glibc 2.27 and earlier. s/backslash/backref/ +# If you're using older glibc you can upgrade to glibc 2.28 or later, +# configure --with-included-regex, or ignore the test failure. endif -- Eric Blake, Principal Software Engineer Re

bug#30919: [grep: input file './x’ is also the output] too late!.. have a null file.

2018-03-23 Thread Eric Blake
he truncation). But even if we add a new option, it would take years before it reaches common distros, and would still be a GNU extension not present on other platforms, so you couldn't necessarily rely on it. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org

bug#30326: grep not searching through a text file (thinking it binary)

2018-02-02 Thread Eric Blake
tentional, so I'm closing this as not a bug in the tracker. However, feel free to add further comments or questions to the thread. And perhaps we could tweak the grep diagnostics to clarify whether a file is binary because NUL bytes were encountered, vs. a file is binary because encodin

bug#30242: help

2018-01-24 Thread Eric Blake
decorate paradigm (grep -n decorates matches with 'line:' and followups with 'line-'; the sed then picks the followups and removes the decorations). As -A is already documented as a grep option, I'm closing this as not a bug in the database. However, feel free to followup with fur
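
The decorate-and-strip idea can be sketched as follows. The exact commands from the original message are not visible in this excerpt, so the sed expression and the sample input here are assumptions that merely follow the description.

```shell
# Sketch: print only the line following each match.
# grep -n -A1 emits "2:MATCH" for the match and "3-two" for the follow-up;
# sed keeps the "N-" lines and strips that decoration.
followups=$(printf '%s\n' one MATCH two three |
  grep -n -A1 MATCH |
  sed -n 's/^[0-9][0-9]*-//p')
echo "$followups"
```

Requiring at least one digit before the dash keeps the `--` group separator, which grep prints between non-adjacent hunks, out of the result.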

bug#27783: display help instead of pattern search

2017-07-21 Thread Eric Blake
ail and any attachments are confidential and may be > privileged. Sorry, but your disclaimer is unenforceable on publicly-archived lists. Please consider using a personal address to avoid spamming us with your employer's legalese. -- Eric Blake, Principal Software Engineer Red Ha

bug#27666: [grep on GPFS filesystem] SEEK_HOLE problem

2017-07-20 Thread Eric Blake
k > with 'grep' (albeit more slowly), and is the documented way that SEEK_HOLE is > supposed to work on file systems that cannot support SEEK_HOLE directly. > -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org

bug#27666: [grep on GPFS filesystem] SEEK_HOLE problem

2017-07-18 Thread Eric Blake
obliged to report holes, but IS obliged to NOT report holes if a read() on that range will not see zeroes. I still think GPFS has a bug. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org

bug#27666: [grep on GPFS filesystem] SEEK_HOLE problem

2017-07-12 Thread Eric Blake
te filesystem/kernel folks. SEEK_HOLE is only allowed to return a mid-file offset if reading the file at that point in time would read NUL bytes, and NUL bytes are indeed binary data. > It could take several seconds to save the entire file on the disk. Does running 'sync' prior to grep

bug#26832: bug on grep 3.0

2017-05-11 Thread Eric Blake
pectations of an environment feeding \r\n to grep without accounting for the \r no longer being silently ignored. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org

bug#26832: bug on grep 3.0

2017-05-11 Thread Eric Blake
On 05/10/2017 12:38 PM, Eric Blake wrote: >> BTW, I realized follows. >> Incorrect outputs of grep are output only on command prompt of Windows. >> Correct results are output on console of cygwin. > > Then this is an issue in how cygwin programs handle their arguments when

bug#26832: bug on grep 3.0

2017-05-10 Thread Eric Blake
ion on the cygwin mailing list. As such, I'm taking the liberty to close this in the upstream database, as there's nothing we can do here to change behavior. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org

bug#26576: -v when used with -C

2017-04-20 Thread Eric Blake
ng output if any of the last three inputs were UGLY. But more complicated than I want to spend time on for the sake of this email. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org

bug#26576: -v when used with -C

2017-04-20 Thread Eric Blake
On 04/20/2017 11:38 AM, 積丹尼 Dan Jacobson wrote: > Yes, if somebody ever adds this option perhaps call it --compliment. Except that you mean --complement (you are not praising the lines, but making an opposite selection of lines). -- Eric Blake, Principal Software Engineer Red Hat,

bug#26576: -v when used with -C

2017-04-20 Thread Eric Blake
text you want to poison), then in an END block only print out lines if the corresponding poison[] entry is not 1. Although I'll leave that as an exercise for the reader. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org
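
One possible shape for the exercise, offered as a sketch rather than as the author's solution: the array name `poison` comes from the message, while the `ctx` variable, the `line` array, the pattern `UGLY`, and the sample input are assumptions made for this example.

```shell
# Sketch: drop every line within ctx lines of a match, print the rest.
kept=$(printf '%s\n' a b UGLY c d |
  awk -v ctx=1 '
    { line[NR] = $0 }                       # remember every input line
    /UGLY/ {                                # poison the match and its context
      for (i = NR - ctx; i <= NR + ctx; i++) poison[i] = 1
    }
    END {                                   # print only unpoisoned lines
      for (i = 1; i <= NR; i++) if (poison[i] != 1) print line[i]
    }
  ')
echo "$kept"
```

Because the decision is deferred to the END block, a match can poison lines that were read before it, which a single streaming pass could not do.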

bug#26576: -v when used with -C

2017-04-20 Thread Eric Blake
ing set" doesn't tell me what you really want. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org

bug#26576: -v when used with -C

2017-04-20 Thread Eric Blake
tching 2 lines, then 2 lines tail context, then a hunk separator, then 2 lines head context, then 2 more matching lines. Therefore, I'm tagging this as not a bug. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org

bug#26322: grep '*' VS grep -E '*'

2017-03-31 Thread Eric Blake
ut that translates to '.*' in both BRE and ERE syntax. At any rate, I don't see this as a bug, so I'm closing the instance in the bug-tracker, but feel free to reply with further comments or questions. -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org

bug#26205: Unhappy with deprecating GREP_OPTIONS

2017-03-21 Thread Eric Blake
rd, rather fast, if you try to special-case WHICH aspects of GREP_OPTIONS are safe, vs. just a blanket statement that GREP_OPTIONS is in general unsafe. -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org

bug#26005: bug in grep

2017-03-06 Thread Eric Blake
om shell globbing changing the parameters based on the contents of the current directory. As the problem is in your improper use of shell quoting, and not in grep, I'm closing this as not a bug. However, feel free to respond if you still have questions. -- Eric Blake eblake redhat com
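
The effect of shell globbing on an unquoted pattern can be seen without involving grep at all. This is a minimal sketch; the file name `ab.txt` and the pattern `a*` are invented for the illustration.

```shell
# Sketch: what a command actually receives for a quoted and an unquoted pattern.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
: > ab.txt

unquoted=$(printf '%s\n' a*)     # the shell rewrites a* to ab.txt before the command runs
quoted=$(printf '%s\n' 'a*')     # the command receives a* unchanged

echo "unquoted: $unquoted"
echo "quoted:   $quoted"
cd / && rm -rf "$tmpdir"
```

The same substitution happens when the command is grep, so the regular expression it sees depends on which files happen to exist in the current directory.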

bug#25707: [PATCH] grep: don't forcefully strip carriage returns

2017-02-16 Thread Eric Blake
optimal, and one considers its replacement with > a more clever code, it would be a good idea to ask the person(s) who > contributed the original code, in case there was some good reason for > doing it that way. Was that done in this case? If not, it should > have been. > -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org

bug#25707: [PATCH] grep: don't forcefully strip carriage returns

2017-02-15 Thread Eric Blake
On 02/14/2017 05:08 PM, Paul Eggert wrote: > On 02/13/2017 12:20 PM, Eric Blake wrote: >> undossify_input causes more problems than it >> solves. We should trust fopen("r") to do the right thing, rather than >> reinventing it ourselves. > > Yes, that makes s

bug#25707: [PATCH] grep: don't forcefully strip carriage returns

2017-02-13 Thread Eric Blake
On 02/13/2017 02:00 PM, Paul Eggert wrote: > On 02/13/2017 11:23 AM, Eric Blake wrote: >> the use of fopen("rt") actively >> breaks assumptions on a binary mount by silently corrupting any >> carriage returns that are supposed to be preserved. > > Surely it

bug#25707: [PATCH] grep: don't forcefully strip carriage returns

2017-02-13 Thread Eric Blake
is possible to control whether a mount point is binary or text by default (using just "r"), the use of fopen("rt") actively breaks assumptions on a binary mount by silently corrupting any carriage returns that are supposed to be preserved. * src/grep.c (main): Never use fopen

bug#25048: --with-included-regex vs. e-acute piped into LC_ALL=fr_FR.iso88591 grep '[d-f]'

2016-11-28 Thread Eric Blake
SHOULD be adjusting more and more GNU tools to honor rational range behavior, at least as an option, even if that means that e-acute can never be matched to [d-f]. -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org

bug#24858: URGENT: Question about grep

2016-11-02 Thread Eric Blake
n is more than > compensated for by the reduced copying of the datastream: > > grep -E '^.{0,30}GTGTCA That searches up to 36 characters. If you want to limit it to just the first 30, you need '^.{0,24}GTGTCA', since the match will never occur later than the 24th cha
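
The arithmetic behind `{0,24}` can be checked with two synthetic lines. This is a sketch; the filler character `N` and the variable names are assumptions made for the example.

```shell
# Sketch: a 6-character match must start by column 25 to end by column 30.
pad24=$(printf '%24s' '' | tr ' ' N)   # 24 filler characters
pad25=$(printf '%25s' '' | tr ' ' N)   # 25 filler characters

# GTGTCA in columns 25-30: inside the first 30 characters.
inside=$(printf '%s\n' "${pad24}GTGTCA" | grep -E -c '^.{0,24}GTGTCA' || true)
# GTGTCA in columns 26-31: one column too far for {0,24} ...
outside=$(printf '%s\n' "${pad25}GTGTCA" | grep -E -c '^.{0,24}GTGTCA' || true)
# ... but still accepted by {0,30}, which reaches as far as column 36.
loose=$(printf '%s\n' "${pad25}GTGTCA" | grep -E -c '^.{0,30}GTGTCA' || true)

echo "inside=$inside outside=$outside loose=$loose"
```

The `|| true` only keeps the sketch usable under `set -e`, since `grep -c` exits with status 1 when the count is 0.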

bug#24858: URGENT: Question about grep

2016-11-02 Thread Eric Blake
a match only in the first 30 characters by explicitly spelling out the fixed-width remainder of the line as an anchor: grep 'ACGTAC.*.\{50\}$' filename Sadly, the two example lines you printed were not the same length, so I don't think it helps for your case. -- Eric Blake

bug#24858: URGENT: Question about grep

2016-11-02 Thread Eric Blake
And I'll leave the awk program as an exercise for the reader. Therefore, I'm tagging this as not a bug. -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org

bug#24609: egrep '2\.?[0–9]' datafile does not work as expected

2016-10-04 Thread Eric Blake
't read your mind to tell you where your command line went wrong (likely) or whether you have found a bug in egrep (less likely). Can you give some more information at what you are trying to accomplish? -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http:/

bug#24025:

2016-07-19 Thread Eric Blake
line and write the same > script it works no problem, any help on this matter would be greatly > appreciated. You'll have to provide more details for us to be able to reproduce your situation; most likely it is not a bug in grep but a problem with your shell script, but I'll wait

bug#23983: [PATCH] grep: fix crash with a pattern of alternation of two same characters

2016-07-14 Thread Eric Blake
haracters is same, the trie has no child. memchr2() should already be handling the special case of the same character requested twice, without clients having to code around it. -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org

bug#23983: [PATCH] grep: fix crash with a pattern of alternation of two same characters

2016-07-14 Thread Eric Blake
k = link->llink ? link->llink : link->rlink ? link->rlink : link; > > - char const *mch = memchr2 (s, link->label, clink->label, n); so that you end up passing link->label to both parameters of memchr2() when there are no further children in the trie? -- Eric Bla

bug#23763: Bug report: Grep stops, if a text file contains a null character after 32768 bytes

2016-06-15 Thread Eric Blake
onstant NULL is a null pointer, as is the C expression '((void*)0)', although the null pointer need not have an all-zero bit representation in hardware). http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html in particular 3.243-3.245 -- Eric Blake eblake redhat com+1-919-

bug#23763: Bug report: Grep stops, if a text file contains a null character after 32768 bytes

2016-06-13 Thread Eric Blake
hen -a is not used, I'm closing this as not a bug. But feel free to add further comments to this thread. -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org

bug#23361: 【Bug】bug report of GNU grep

2016-04-25 Thread Eric Blake
;) with regular expressions (where "." means "any character", and "*" means "zero or more repetitions of the previous regex construct, unless there is no previous regex construct, in which case it is well-defined for BRE but undefined for ERE"). At any rate, this is not a bug in grep, so I'm closing the bug report. But feel free to add further comments or questions on this thread. -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org
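
The BRE half of that statement is easy to observe. The sketch below uses made-up input lines; the ERE case is deliberately not run, because its result is undefined and so no particular outcome can be promised.

```shell
# Sketch: a leading "*" in a basic regular expression is a literal star.
bre_star=$(printf '%s\n' 'a*b' 'ab' | grep '*')

# ".*" means "any character, zero or more times", so it matches every line.
any_count=$(printf '%s\n' 'a*b' 'ab' | grep -c '.*')

echo "literal star matched: $bre_star"
echo "lines matched by .*:  $any_count"
```

Only the line containing an actual `*` is selected by the first command, while `.*` selects both lines.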

bug#23234: unexpected results with charset handling in GNU grep 2.23

2016-04-06 Thread Eric Blake
On 04/06/2016 05:04 PM, Bjoern Jacke wrote: > On 07.04.2016 00:33, Eric Blake wrote: >> That behavior complies with POSIX requirements. > > can you give a quote here? One thing which is not POSIX compliant is > that the diagnostic messages is given back on stdout. > http

bug#23234: unexpected results with charset handling in GNU grep 2.23

2016-04-06 Thread Eric Blake
On 04/06/2016 04:23 PM, Bjoern Jacke wrote: > On 06.04.2016 23:04, Eric Blake wrote: >> The change of treating encoding errors as binary files will NOT be >> reverted, but here, > > hmm ... think of log files: In log files you will usually find all kind > of encoding

bug#23234: unexpected results with charset handling in GNU grep 2.23

2016-04-06 Thread Eric Blake
rrors - all 256 byte values are characters). So this is indeed a bug to be fixed. -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org

bug#23227: Inconsistent behavior for --file=~/some-file

2016-04-05 Thread Eric Blake
On 04/05/2016 04:03 PM, Eric Blake wrote: > Tilde expansion in the shell is defined by POSIX to only happen if ~ > occurs as the first character of a word > > Since this behavior is baked into your shell, there's nothing grep can > do about it, so I'm closing this as n

bug#23227: Inconsistent behavior for --file=~/some-file

2016-04-05 Thread Eric Blake
; as two separate arguments (for all long options where the argument is not optional) is the easiest way to get ~ to the front of the word and thus have tilde expansion again. Since this behavior is baked into your shell, there's nothing grep can do about it, so I'm closing this as n
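
The difference between the glued and the split spelling can be shown by printing the arguments a command would receive. This is a sketch: `show_args` is a hypothetical helper, and `HOME` is set to a made-up directory only so that the output is predictable.

```shell
# Sketch: where the shell performs tilde expansion.
HOME=/home/demo                      # hypothetical home directory for the demo
show_args() { printf '[%s]' "$@"; echo; }

glued=$(show_args --file=~/some-file)   # ~ is not at the start of the word
split=$(show_args --file ~/some-file)   # ~ starts its own word, so it is expanded

echo "glued: $glued"
echo "split: $split"
```

In the glued form the program receives the literal `~`, so it would look for a directory actually named `~` relative to the current directory.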

bug#23031: reporting write errors and handling SIGPIPE

2016-03-18 Thread Eric Blake
ght shell, and requires an intermediary C program). -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org

bug#22838: New 'Binary file' detection considered harmful

2016-02-29 Thread Eric Blake
tricky business; and any such change is at least 3 or 4 years down the road before it could be standardized in Issue 8 (right now, the focus is on Technical Corrigendum 2 for Issue 7). -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org

bug#22838: New 'Binary file' detection considered harmful

2016-02-29 Thread Eric Blake
d control characters inside. > > Since 2.21 I will now have to always specify -a or LC_ALL=C when > grepping my files. Yes, but then you are no longer relying on undefined behavior, and therefore have a leg to stand on if we break that behavior. -- Eric Blake eblake redhat com+1-
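
Both workarounds named in the quoted text can be tried on a line containing a byte that is not valid UTF-8. This is a sketch with invented input: byte 0xE9 (octal 351) is "é" in Latin-1 but an encoding error in a UTF-8 locale.

```shell
# Sketch: two ways to make grep treat such input as text.
# In the C locale every byte is a character, so there is no encoding error.
c_locale=$(printf 'caf\351\n' | LC_ALL=C grep -o caf)

# -a tells grep to process the input as text regardless of the locale.
with_a=$(printf 'caf\351\n' | grep -a -o caf)

echo "LC_ALL=C: $c_locale"
echo "-a:       $with_a"
```

Without either workaround, a recent grep running in a UTF-8 locale would typically report only that the binary input matches, which is the behavior the thread is discussing.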

bug#22838: New 'Binary file' detection considered harmful

2016-02-29 Thread Eric Blake
On 02/29/2016 10:54 AM, Eric Blake wrote: > Encoding errors are not characters, but bytes. A line cannot contain > encoding errors. Therefore, a file with encoding errors is not a text file. Corollary - there exist files which are text files in some locales, but binary files in others (ba

bug#22838: New 'Binary file' detection considered harmful

2016-02-29 Thread Eric Blake
racter set in Portable Character Set > for a further explanation of the graphical representations of (abstract) > characters, as opposed to character encodings. > Encoding errors are not characters, but bytes. A line cannot contain encoding errors. Therefore, a file with e

bug#22838: New 'Binary file' detection considered harmful

2016-02-29 Thread Eric Blake
On 02/29/2016 10:14 AM, Marcello Perathoner wrote: > > A text file with encoding problems is a text file and not a binary file. Wrong, at least according to the POSIX definition of text file. A text file is one with no encoding errors. -- Eric Blake eblake redhat com+1-919-30

bug#22606: \? and \* behavior near the start of an expression disagree

2016-02-08 Thread Eric Blake
t ONE syntax, not 20 disparate flavors (of which the two most popular become POSIX BRE and ERE, with weird rules on what is valid where). But since this behavior is intentional and required by POSIX, I'm closing this as not a bug. Feel free to reply to the thread with further questions, though. -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org

bug#22071: incorrect behaviour for inverted matches with -l on empty files

2015-12-02 Thread Eric Blake
any match (and not the names of all files that had lines that didn't match). That is, -v and -L are different types of negation. $ grep -L '' empty empty $ grep -Lv '' empty empty The empty file has no matches to any patterns (whether or not the pattern is negated with -v)

bug#21827: Large file support

2015-11-05 Thread Eric Blake
that platform? > > The bug was introduced on 2015-06-04 and fixed on 2015-08-10 in glibc. I guess it depends on whether we think these flawed versions are common in the wild, or whether vendors have patched the major distros and only self-built glibc remains vulnerable, on whether it is wor

bug#21758: Translation Error in help info of "grep"

2015-10-26 Thread Eric Blake
8+0800\n" > "Last-Translator: Ji ZhengYu \n" > "Language-Team: Chinese (simplified) \n" > "Language: zh_CN\n" > > It would be great if you were to help improve those translations > in the next couple of days. I expect to release grep-2.22 very soon. -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org

bug#21621: Bug | grep | Running in loop

2015-10-05 Thread Eric Blake
is intentional, and since grep is waiting for you to do something (it is NOT burning 100% CPU during that wait), I'm marking this as not a bug. However, feel free to add more comments or questions on the topic. -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization lib

bug#21554: requesting a new release

2015-09-24 Thread Eric Blake
you testing? Are you sure you have a matching gnulib submodule checkout to go along with the grep.git checkout you are attempting? We should try and fix the reason that is preventing you from compiling, whether or not there is a release; but that requires more details. -- Eric Blake eblake re

bug#20837: Code for --color=auto

2015-06-17 Thread Eric Blake
On 06/17/2015 09:04 PM, Paul J. Lucas wrote: > On Jun 17, 2015, at 12:52 PM, Eric Blake wrote: > >> But you don't always want color when piping. > > I know. I was asking specifically about what grep does (or should do) only > when the user supplies —color=auto. >

bug#20837: Code for --color=auto

2015-06-17 Thread Eric Blake
for --color=auto or --color=always; different programs have picked different defaults, but both sides have an argument for their choice and an issue with potential back-compat breakage to clients if they change the default] -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org

bug#20826: SEEK_HOLE not supported for ext4 for kernel < 3.1

2015-06-16 Thread Eric Blake
On 06/16/2015 07:06 AM, Pádraig Brady wrote: > On 16/06/15 13:47, Eric Blake wrote: >> [adding bug-gnulib] >> >> On 06/16/2015 06:28 AM, Johannes Meixner wrote: >>> >>>> From one of our (SUSE) kernel developers >>> I even got a proposal for a wor

bug#20826: SEEK_HOLE not supported for ext4 for kernel < 3.1

2015-06-16 Thread Eric Blake
ole */ > + } > + Sounds like we need to update the gnulib lseek module to work around bugs like this in the wild. -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org

bug#20638: BUG: standard & extended RE's don't find NUL's :-(

2015-05-24 Thread Eric Blake
y of treating NUL as a line terminator when -a is not in effect, thanks to the behavior being otherwise unspecified by POSIX. Try using 'grep -a' to force grep to treat the file as non-binary, in spite of the NULs. -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virt
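
The effect of `-a` on input containing a NUL byte can be sketched as follows; the sample data is invented for the example.

```shell
# Sketch: -a makes grep process input containing NUL bytes as text.
count=$(printf 'foo\000bar\n' | grep -a -c foo)   # number of matching lines
piece=$(printf 'foo\000bar\n' | grep -a -o bar)   # text after the NUL is still searchable

echo "matching lines: $count"
echo "matched text:   $piece"
```

Without `-a`, grep would classify this input as binary and, by default, suppress the matching line itself.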

bug#20605: Grep command

2015-05-18 Thread Eric Blake
utput in the below order > > 04A2 > 04A4 > 17F5 Which means that the first line that matched is one that matched your third alternative in the single pattern. > > Please let me if any additional switches to use. The only way to get output in a particular order is to do three
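
The ordering behavior can be reproduced with a three-line file. This is a sketch: the data file is made up, and running one grep per pattern is only one plausible reading of the truncated advice above.

```shell
# Sketch: one grep with alternation reports matches in file order;
# separate greps, run in sequence, report them in pattern order.
tmpdir=$(mktemp -d)
printf '%s\n' 17F5 04A2 04A4 > "$tmpdir/data"

file_order=$(grep -E '04A2|04A4|17F5' "$tmpdir/data")
pattern_order=$(for p in 04A2 04A4 17F5; do grep -e "$p" "$tmpdir/data"; done)

echo "single grep:"; echo "$file_order"
echo "one grep per pattern:"; echo "$pattern_order"
rm -rf "$tmpdir"
```

The single invocation reads the file once, so the order of alternatives in the pattern has no influence on the order of the output.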

bug#20526: BUG: text file is detected as binary

2015-05-12 Thread Eric Blake
on't match for output, may still cause issues for some patterns (we've had cases of encoding errors causing 'grep -P' to go into an infinite loop, for example); but yes, as the behavior is undefined, we are still justified in adopting those heuristics, if someone is willing to co

bug#20526: BUG: text file is detected as binary

2015-05-07 Thread Eric Blake
y have a strong opinion on that ;) It would be fine if they would recode their file to use UTF-8, as that is pretty much a standard encoding these days. Latin-1 files are getting harder and harder to process, as more people move to multibyte UTF-8 locales. -- Eric Blake eblake redhat com

bug#20443: grep pattern is forward slash + star

2015-04-27 Thread Eric Blake
literal star rather than being the 0-or-more-operator. As such, this is not a bug, so I'm closing the report. However, feel free to ask further questions on the topic. -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org

bug#19738: How did [a-z] match é?

2015-01-31 Thread Eric Blake
ug, but do feel free to make further comments or questions. -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org

bug#19357: [PATCH] grep: fails to build grep on a machine which has no PCRE with --enable-gcc-warnings

2014-12-12 Thread Eric Blake
t make this a NEWS item. It is not a visible change in behavior to the end user, only to the build process. -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org

bug#19242: latest grep considers text files as binary

2014-12-05 Thread Eric Blake
On 12/05/2014 08:34 AM, Eric Blake wrote: > On 12/05/2014 02:58 AM, Thomas Wolff wrote: >> Paul Eggert wrote: >>>> the mentioned patches are apparently intended to fix issues in >>>> non-UTF-8 locales. >>> No, they're also needed for UTF-8 locales

bug#19242: latest grep considers text files as binary

2014-12-05 Thread Eric Blake
aving only NUL bytes and a non-empty file not ending in newline as the only reasons for a file to be marked binary. -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org

bug#19242: latest grep considers text files as binary

2014-12-05 Thread Eric Blake
the current locale, then that file is binary under the current locale, even though it may be text in a better locale. > The manual clearly says about -a: "Process a binary file as if it were > text" but partial content in a different text encoding does not make a >

bug#19094: Clean doc/{stamp-vti,version.texi}

2014-11-18 Thread Eric Blake
277b0 > +Subproject commit 8415b6792e53f9aa309caedda799f9d9f3dffc53 Also, your patch incorrectly changes the gnulib submodule. -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org

bug#18888: [platform-testers] new snapshot available: grep-2.20.72-d512

2014-10-31 Thread Eric Blake
's locale is defined, or because of a bug in grep. I also see: FAIL: surrogate-pair and in this case, cygwin DOES have weird handling of surrogate pairs (it chose wchar_t to be 2 bytes, which means surrogate pairs are required to represent all Unicode characters). Not sure if that will

bug#18863: grep -b bug

2014-10-28 Thread Eric Blake
ugh you should feel free to make further comments or even reopen this if you can demonstrate that a problem still exists. $ grep --version | head -n1 grep (GNU grep) 2.18 $ echo gnu is not unix | grep -b -o not 7:not -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualiz

bug#18817: \w is not synonym for [[:alnum:]] in UTF-8 locales

2014-10-24 Thread Eric Blake
so tell the lexer to process those > + strings, each minus its "already processed" '['. */ s/album/alnum/ -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org

bug#18777: [PATCH] dfa: improvement for checking of multibyte character boundary

2014-10-20 Thread Eric Blake
0-9 be single bytes, it does not forbid those characters from also being bytes embedded within multibyte characters). Is it worth extending your optimization to all five of the POSIX-guaranteed single byte characters? -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualizati

bug#18454: Improve performance when -P (PCRE) is used in UTF-8 locales

2014-09-17 Thread Eric Blake
rd > header files (glibc 2.5.1) and gnulib files. > It should be fairly easy for gnulib to fake SEEK_DATA/SEEK_HOLE (by treating all files as non-sparse). I guess we haven't needed to do that before now, because other GNU clients (such as coreutils and tar) of this have been doing condi

bug#18406: O_NOATIME patch

2014-09-11 Thread Eric Blake
avoid updating atime > on directories with grep -r, and it should be documented properly in > grep.texi and in 'grep --help' output and in NEWS (plus maybe write a > test case or two). Lots of work, but I like the idea. In fact, I proposed a similar idea for coreutils'

bug#18266: grep -P and invalid exits with error

2014-08-29 Thread Eric Blake
r a second invalid character s/cheks/checks/ > + /* Change invalid UTF-8 characters (according to pcre_exec) to > '\0' */ > + while (e == PCRE_ERROR_BADUTF8){ Space before { > +line_utf8_clean[sub[0]+invalid_pos] = '\0'; Spaces around +

bug#18218: ERROR

2014-08-07 Thread Eric Blake
ed to do, but it should be fairly obvious from the error message that your problem was invalid usage on your part, and not a bug in grep. Feel free to ask more questions or give us more details about your situation, but at this point, I'm closing this bug so that it doesn't show up as an

bug#17981: [PATCH] maint.mk: less syntax-check noise when SIGPIPE is ignored

2014-07-11 Thread Eric Blake
On 07/11/2014 02:58 PM, Paul Eggert wrote: > On 07/08/2014 12:42 PM, Eric Blake wrote: >> It is unclear >> at this point whether POSIX would recommend that filter >> applications should _always_ exit with 0 status on pipe failure, >> or only do this for EPIPE write failu

bug#17981: [PATCH] maint.mk: less syntax-check noise when SIGPIPE is ignored

2014-07-09 Thread Eric Blake
early, but at least it cuts down on the noise. * top/maint.mk (_sc_header_without_use) (sc_require_config_h_first): Parse full list. Signed-off-by: Eric Blake --- I'll push this to gnulib to work around the issue. But it really begs the question - can sed and grep be ta

bug#17516: [PATCH] grep: no count newline at the head of a text buffer

2014-05-21 Thread Eric Blake
s again. > However, that new test made it so "make syntax-check" would fail. > I've suppressed that new failure via the attached: It's also possible to rewrite the line: grep -f in 'in' >out || fail=1 so as to avoid needing the suppression (I'm not su

bug#17471: On Solaris 10, grep snapshot apparently hit by bleeding-edge Autoconf bug

2014-05-12 Thread Eric Blake
ELL = > /bin/bash, which is what grep does with my test builds. In autoconf.git, there are zero hits for: git grep -F '0%/*' However, in grep.git, there is: src/egrep.sh:if test -x "${0%/*}/@grep@"; then src/egrep.sh: PATH=${0%/*}:$PATH The culprit is grep
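For readers unfamiliar with the `${0%/*}` construct in the two `src/egrep.sh` lines quoted above: it is the POSIX "remove shortest matching suffix" expansion, applied here to `$0` to obtain the directory the script was invoked from. The sketch below only illustrates what the expansion evaluates to. `dir_of` is a hypothetical helper introduced for the demonstration, and the snippet is cut off before it states the actual diagnosis, so no claim is made here about which shells reject the syntax.

```shell
# Hypothetical helper: the directory part of a path, computed the same way
# ${0%/*} computes it for $0 in the quoted egrep.sh lines.
dir_of() {
  path=$1
  # Remove the shortest suffix matching "/*", i.e. the last path component.
  echo "${path%/*}"
}

dir_of /usr/local/bin/egrep    # prints /usr/local/bin
dir_of egrep                   # no slash to strip: prints egrep unchanged
```

The second call shows a case worth keeping in mind: when the script is invoked through `PATH` lookup with no slash in `$0`, the expansion returns the name itself rather than a directory.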

bug#15199: UTF-16 surrogate pair handling in grep -i option

2014-04-28 Thread Eric Blake
'm lacking > context. Thanks. Threading-wise, you want to start at <http://bugs.gnu.org/15192>

bug#17229: [PATCH 2/2] grep: speed-up by using memchr() in Boyer-Moore searching

2014-04-25 Thread Eric Blake
ions or any naive byte-by-byte comparisons. I suspect that using memchr2() for case-insensitive searches may allow you a speedup when searching for (the first byte of) two potential matches in the search string to the first character of a case-insensitive pattern.

bug#17080: [PATCH] egrep, fgrep: go back to shell scripts

2014-04-23 Thread Eric Blake
er program - you need to consider what happens when the user supplies program_transform_name as part of their configure arguments. This recent autoconf thread picked on grep as an example - so we'd better get the example right :) https://lists.gnu.org/archive/html/autoconf/2014-04/msg00011.htm

bug#17229: [PATCH 2/2] grep: speed-up by using memchr() in Boyer-Moore searching

2014-04-10 Thread Eric Blake
onsider feeding the glibc improvements back to gnulib). If the native memchr can be demonstrated to be slower on certain platforms in comparison to an implementation that we can write in straight C, then having gnulib supply the faster replacement is so much easier to maintain than to special-case

bug#17056: dfa.c patch for systems with no locale support

2014-03-27 Thread Eric Blake
rgument of setlocale() returning NULL as indicative of some rare error. But in main(), where you are the first call, the only thing that a NULL return implies is that your attempt to change the locale had no effect, so it remains at the locale it was before, which is the C locale since all prog

bug#17095: [PATCH] grep: proceed the `beg' pointer after exact matched in KWSet

2014-03-26 Thread Eric Blake
tch. > > patch.txt > > RnJvbSA3MTE1OGIyZmE3OTkzNzliZGNkYjZmNWFjMWI5M2Y3ODU2NmZiZDQ0IE1vbiBTZXAgMTcg > MDA6MDA6MDAgMjAwMQpGcm9tOiBOb3JpaGlybyBUYW5ha2EgPG5vcml0bmtAa2NuLm5lLmpwPgpE Your patch is once again illegible.

bug#16911: [PATCH] grep: fix bugs with -i and titlecase

2014-03-01 Thread Eric Blake
LETTER J), > + 'grep -i Lj' now matches 'LJ' (U+01C7 LATIN CAPITAL LETTER LJ). Does it also match the lower case version? In other words, are all three cases of this character treated as equivalent? It might help to mention all three characters in the NEWS blurb.

bug#16812: Eszett handling

2014-02-19 Thread Eric Blake
PROPER handling of locale-sensitive case rules, we'd need full Unicode rules that operate on words, rather than characters, which quickly gets out of scope of what we can do in POSIX regex.
