keith-turner commented on issue #88:
URL: https://github.com/apache/accumulo-access/issues/88#issuecomment-3583437196

   >  A simpler description means there's less to read, less to implement, and 
probably better correctness in other implementations.
   
   That sounds great, I would support simplifying the ABNF to state an 
authorization is  utf8 without defining utf8.
   
   > It looks to me like the access expression parser ColumnVisibilityParser in 
Accumulo still assumes that the byte sequences it's parsing represent valid 
UTF-8. 
   
   I looked into this, the code is happy with non utf8 data.    However there 
are multiple javadocs that mention UTF-8 is expected but that its not checked 
for like 
[here](https://github.com/apache/accumulo/blob/b5461fdbf8920209635a2a650d21c902d62bbd7d/core/src/main/java/org/apache/accumulo/core/security/Authorizations.java#L143-L149)
   
   > If an additional mode is under consideration
   
   Not sure if that is the best solution, do need to resolve this problem in 
some way though and bring the spec and code into agreement while considering 
compatibility with existing data.
   
   I wrote the following test to experiment using non utf8 data in Accumulo 
(added this to VisibilityEvaluatorTest locally).  Ran this test in 
2.1.5-SNAPSHOT.  This shows the code in Accumulo is happy and works w/ non utf8 
data.   
   
   ```java
     @Test
     public void testNonUtf8() throws VisibilityParseException {
       byte[] auth1 = new byte[]{(byte) 0xf0,'a'};
       byte[] auth2 = new byte[]{(byte) 0xe0,'c'};
       byte[] auth3 = new byte[]{(byte) 0xb0,'d'};
   
       var decoder = 
StandardCharsets.UTF_8.newDecoder().onUnmappableCharacter(CodingErrorAction.REPORT).onMalformedInput(CodingErrorAction.REPORT);
       // verify the auths are not utf8
       assertThrows(CharacterCodingException.class, 
()->decoder.decode(ByteBuffer.wrap(auth1)));
       assertThrows(CharacterCodingException.class, 
()->decoder.decode(ByteBuffer.wrap(auth2)));
       assertThrows(CharacterCodingException.class, 
()->decoder.decode(ByteBuffer.wrap(auth3)));
   
       VisibilityEvaluator evaluator = new VisibilityEvaluator(new 
Authorizations(List.of(ByteBuffer.wrap(auth1), ByteBuffer.wrap(auth2))));
   
       // Construct an expression avoiding passing data through String as it 
may corrupt/transform the data
       ByteArrayOutputStream out = new ByteArrayOutputStream();
       out.writeBytes(quote(auth1));
       out.write('&');
       out.writeBytes(quote(auth2));
       byte[] exp1 = out.toByteArray();
   
       for(int i =0; i<exp1.length;i++){
         System.out.printf("%d %x %s\n", i, exp1[i], (char)exp1[i]);
       }
   
       out = new ByteArrayOutputStream();
       out.writeBytes(quote(auth1));
       out.write('&');
       out.writeBytes(quote(auth3));
       byte[] exp2 = out.toByteArray();
   
       assertFalse(evaluator.evaluate(new ColumnVisibility(exp2)));
     }
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to