I inserted some sample data and was able to use a regex to find values
that match "itemId: " followed by a valid UUID and, using a negative
lookahead, values where "itemId: " is followed by anything other than a
valid UUID.

See below:

root@uno t1> scan
a b:c []    itemId: 11aa22bb-d33d-e44e-f55f-6677889900cc
b b:c []    itemId: 11aa22bbd33d2abcav34-11d25d334455
c b:c []    nope: 11aa22bbd33d2abcav34-11d25d334455
d b:c []    nope: 11aa22bb-d33d-2abc-av34-11d25d334455
root@uno t1> egrep '.*itemId: (?:[a-f0-9]{8}(?:-[a-f0-9]{4}){4}[a-f0-9]{8}).*'
a b:c []    itemId: 11aa22bb-d33d-e44e-f55f-6677889900cc
root@uno t1> egrep '.*itemId: (?\![a-f0-9]{8}(?:-[a-f0-9]{4}){4}[a-f0-9]{8}).*'
egrep '.*itemId: (?![a-f0-9]{8}(?:-[a-f0-9]{4}){4}[a-f0-9]{8}).*'
b b:c []    itemId: 11aa22bbd33d2abcav34-11d25d334455
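
To take the shell's quoting out of the picture, the same two patterns can
be checked with plain java.util.regex against the sample values above.
This is only a sketch: the class and method names are made up for
illustration, it does not touch Accumulo at all, and it assumes the
pattern has to match the whole value (which is why the egrep patterns
carry a leading and trailing ".*").

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Accumulo: checks the two egrep
// patterns from the shell session against plain strings.
public class UuidRegexCheck {
    // UUID sub-pattern copied from the egrep commands above.
    static final String UUID = "[a-f0-9]{8}(?:-[a-f0-9]{4}){4}[a-f0-9]{8}";

    // "itemId: " followed by a valid UUID.
    static final Pattern VALID =
        Pattern.compile(".*itemId: (?:" + UUID + ").*");

    // "itemId: " followed by anything other than a valid UUID
    // (negative lookahead; no shell escaping needed for '!' here).
    static final Pattern INVALID =
        Pattern.compile(".*itemId: (?!" + UUID + ").*");

    static boolean hasValidUuid(String value) {
        // matches() requires the whole value to match the pattern.
        return VALID.matcher(value).matches();
    }

    static boolean hasInvalidItemId(String value) {
        return INVALID.matcher(value).matches();
    }

    public static void main(String[] args) {
        // Sample values from the scan output above.
        String[] values = {
            "itemId: 11aa22bb-d33d-e44e-f55f-6677889900cc",
            "itemId: 11aa22bbd33d2abcav34-11d25d334455",
            "nope: 11aa22bbd33d2abcav34-11d25d334455",
            "nope: 11aa22bb-d33d-2abc-av34-11d25d334455",
        };
        for (String v : values) {
            System.out.println(v + "  valid=" + hasValidUuid(v)
                + "  invalid=" + hasInvalidItemId(v));
        }
    }
}
```

If the patterns behave as expected here but not in the shell, the
difference is most likely in how the shell handles the "!" and
backslashes rather than in the regex itself.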


On Thu, Mar 19, 2020 at 8:17 AM Donald Mackert <macke...@gmail.com> wrote:
>
> Christopher,
>
>         BLUF: We want to find all rows for a given column that do not have a 
> valid UUID.
>
>         Here is an example of what we do not want to match, which is a UUID 
> in the itemId: 11aa22bb-d33d-2abc-av34-11d25d334455
>
>         What we are looking for is a column like the example column, with the 
> first eight characters of the UUID followed by a dash (-).
>
> Don
>
>
> On Wed, Mar 18, 2020 at 11:25 PM Christopher <ctubb...@apache.org> wrote:
>>
>> The shell command, egrep, uses the RegExFilter[1] underneath. It
>> supports Java regular expressions, which do support negative
>> lookahead. So, it should be possible.
>>
>> However, there may be some quoting issues... the shell
>> itself uses backslash to escape, but it also uses JLine to parse
>> output, and JLine might treat the exclamation point specially, so it
>> might need to be escaped twice. However, this is just a guess.
>>
>> I would recommend trying to eliminate the shell variable, and scan
>> using the Java API directly to test.
>>
>> If you can supply some examples on what you want to match, and those
>> you don't want to match, I could probably try it myself to see if I
>> can come up with a solution.
>>
>> [1]: 
>> https://github.com/apache/accumulo/blob/master/core/src/main/java/org/apache/accumulo/core/iterators/user/RegExFilter.java
>>
>> On Wed, Mar 18, 2020 at 7:22 PM Donald Mackert <macke...@gmail.com> wrote:
>> >
>> > Hello,
>> >
>> >     Does the accumulo egrep command support regex negative look ahead?
>> >
>> >     We are trying to find all rows that do not have a UUID pattern using 
>> > the following sample command
>> >
>> >    egrep -c column ^\\{\"value\"\\:\\{\"itemId\"\\:\"((?\![0-9a-f]{8}-).)*$
>> >
>> >   The following egrep returns all rows that match the pattern
>> >
>> >   egrep -c column ^\\{\"value\"\\:\\{\"itemId\"\\:\"([0-9a-f]{8}-).*$
>> >
>> > Thank you,
>> >
>> > Don
