On 9/28/22 14:54, enh wrote: > heh... funny you should mention "." in the first position... > > $ echo "foo.jar" | old-toybox grep '\.jar' > foo.jar > $ echo "foo.jar" | new-toybox grep '\.jar' > $
Technically that's '\' in the first position, which is the problem. (That's both parsing wrong, and I had an optimization where it skips the first character checking fast pattern matches because obviously we already matched that to find the right bucket with the patterns for this character... EXCEPT if that first character is escaped then skipping one character of each side of the comparison will put the pattern traversal out of sync.) Meanwhile, a pattern '^$' should match ONLY empty lines, and the easiest way to handle that is just special case that one to the regex path. The fast path can still handle '^' and '$' which match every line: zero length pattern does a zero length match at each position, which -o has plumbing to filter out because * does that too since it's zero-or-more instances so a* is going to match for length zero at every place that does NOT have an 'a', ala: $ echo '' | grep 'z*' | wc -l 1 All '^' and '$' do is say the zero length match occurs just once at the start or end of every line, but the -o logic discards all zero length matches and the non-o logic just cares that there IS a match. UNLESS you're producing colored output, but the colored output mostly piggybacks on -o. There is a sort of terrible corner case though, which oddly enough makes this corner case visible! $ echo potato | grep --color=always '' | hd 00000000 70 6f 74 61 74 6f 0a |potato.| 00000007 $ echo potato | toybox grep --color=always '' | hd 00000000 1b 5b 6d 1b 5b 31 3b 33 31 6d 1b 5b 6d 70 6f 74 |.[m.[1;31m.[mpot| 00000010 61 74 6f 0a |ato.| 00000014 $ echo potato | toybox grep --color=always '$' | hd 00000000 1b 5b 6d 70 6f 74 61 74 6f 1b 5b 31 3b 33 31 6d |.[mpotato.[1;31m| 00000010 1b 5b 6d 0a |.[m.| 00000014 Yeah, I have a todo item to try to optimize the color escape generation for toybox but last time I sat down and looked at it I just didn't have the spoons. > that a simplified case based on the real-world build break caused by ... > merging zero jar files into one instead of all the jar files :-) Acknowledged. Try commit 193009855266? Also, I checked in the start of the sed/tar xform protocol stuff. It's not handling flags yet, but it's passing the data so it _can_ do so. (I didn't want to send you a tar/sed pair that had to be upgraded in tandem and then changed AGAIN later. Hopefully this is the usable protocol version...) Rob _______________________________________________ Toybox mailing list [email protected] http://lists.landley.net/listinfo.cgi/toybox-landley.net
