I inserted some sample data and was able to use a regex to find values that matched "itemId: " followed by a valid UUID and, using a negative lookahead, values that matched "itemId: " followed by anything other than a valid UUID.
See below:

root@uno t1> scan
a b:c [] itemId: 11aa22bb-d33d-e44e-f55f-6677889900cc
b b:c [] itemId: 11aa22bbd33d2abcav34-11d25d334455
c b:c [] nope: 11aa22bbd33d2abcav34-11d25d334455
d b:c [] nope: 11aa22bb-d33d-2abc-av34-11d25d334455
root@uno t1> egrep '.*itemId: (?:[a-f0-9]{8}(?:-[a-f0-9]{4}){4}[a-f0-9]{8}).*'
a b:c [] itemId: 11aa22bb-d33d-e44e-f55f-6677889900cc
root@uno t1> egrep '.*itemId: (?\![a-f0-9]{8}(?:-[a-f0-9]{4}){4}[a-f0-9]{8}).*'
egrep '.*itemId: (?![a-f0-9]{8}(?:-[a-f0-9]{4}){4}[a-f0-9]{8}).*'
b b:c [] itemId: 11aa22bbd33d2abcav34-11d25d334455

On Thu, Mar 19, 2020 at 8:17 AM Donald Mackert <macke...@gmail.com> wrote:
>
> Christopher,
>
> BLUF: We want to find all rows for a given column that do not have a
> valid UUID.
>
> Here is an example of what we do not want to match, which is a UUID
> in the itemId: 11aa22bb-d33d-2abc-av34-11d25d334455
>
> What looking for a column represented by the example column and the
> first eight characters of the UUID followed by a dash -
>
> Don
>
>
> On Wed, Mar 18, 2020 at 11:25 PM Christopher <ctubb...@apache.org> wrote:
>>
>> The shell command, egrep, uses the RegExFilter[1] underneath. It
>> supports Java regular expressions, which does support negative look
>> ahead. So, it should be possible.
>>
>> However, it is possible there's some quoting issues... the shell
>> itself uses backslash to escape, but it also uses JLine to parse
>> output, and JLine might treat the exclamation point specially, so it
>> might need to be escaped twice. However, this is just a guess.
>>
>> I would recommend trying to eliminate the shell variable, and scan
>> using the Java API directly to test.
>>
>> If you can supply some examples on what you want to match, and those
>> you don't want to match, I could probably try it myself to see if I
>> can come up with a solution.
>>
>> [1]:
>> https://github.com/apache/accumulo/blob/master/core/src/main/java/org/apache/accumulo/core/iterators/user/RegExFilter.java
>>
>> On Wed, Mar 18, 2020 at 7:22 PM Donald Mackert <macke...@gmail.com> wrote:
>> >
>> > Hello,
>> >
>> > Does the accumulo egrep command support regex negative look ahead?
>> >
>> > We are trying to find all rows that do not have a UUID pattern using
>> > the following sample command
>> >
>> > egrep -c column ^\\{\"value\"\\:\\{\"itemId\"\\:\"((?\![0-9a-f]{8}-).)*$
>> >
>> > The following egrep returns all rows that match the pattern
>> >
>> > egrep -c column ^\\{\"value\"\\:\\{\"itemId\"\\:\"([0-9a-f]{8}-).*$
>> >
>> > Thank you,
>> >
>> > Don
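
P.S. For anyone who wants to rule out shell quoting entirely, here is a minimal sketch that checks the same two patterns with plain java.util.regex, without touching Accumulo at all. The class and method names (ItemIdRegexCheck, hasValidItemId, hasInvalidItemId) are made up for illustration. It assumes the filter matches against the whole value with matches(), which is my understanding of how RegExFilter behaves by default, so treat it as a sanity check rather than a reproduction of the iterator. The four values are the ones from the scan output in this thread.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Accumulo: checks the two egrep patterns
// from the shell session using only java.util.regex.
class ItemIdRegexCheck {
    // UUID shape used in the shell session: 8 hex chars, four "-hhhh" groups,
    // then 8 more hex chars, which adds up to the usual 8-4-4-4-12 layout.
    static final String UUID_SHAPE = "[a-f0-9]{8}(?:-[a-f0-9]{4}){4}[a-f0-9]{8}";

    // "itemId: " followed by a valid UUID.
    static final Pattern VALID = Pattern.compile(".*itemId: (?:" + UUID_SHAPE + ").*");

    // "itemId: " followed by anything that is NOT a valid UUID (negative lookahead).
    static final Pattern INVALID = Pattern.compile(".*itemId: (?!" + UUID_SHAPE + ").*");

    // Assumption: whole-value matching, i.e. Matcher.matches() rather than find().
    static boolean hasValidItemId(String value) {
        return VALID.matcher(value).matches();
    }

    static boolean hasInvalidItemId(String value) {
        return INVALID.matcher(value).matches();
    }

    public static void main(String[] args) {
        String[] values = {
            "itemId: 11aa22bb-d33d-e44e-f55f-6677889900cc", // row a
            "itemId: 11aa22bbd33d2abcav34-11d25d334455",    // row b
            "nope: 11aa22bbd33d2abcav34-11d25d334455",      // row c
            "nope: 11aa22bb-d33d-2abc-av34-11d25d334455"    // row d
        };
        for (String v : values) {
            System.out.printf("%-46s valid=%b invalid=%b%n",
                    v, hasValidItemId(v), hasInvalidItemId(v));
        }
    }
}
```

Only row a should come back as valid and only row b as invalid, the same as the two egrep results in the shell. If the shell ever disagrees with this, quoting (for example the \! escape) would be the first thing I'd suspect rather than the regex itself.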