keith-turner commented on issue #88: URL: https://github.com/apache/accumulo-access/issues/88#issuecomment-3583437196
> A simpler description means there's less to read, less to implement, and probably better correctness in other implementations. That sounds great, I would support simplifying the ABNF to state an authorization is utf8 without defining utf8. > It looks to me like the access expression parser ColumnVisibilityParser in Accumulo still assumes that the byte sequences it's parsing represent valid UTF-8. I looked into this, the code is happy with non utf8 data. However there are multiple javadocs that mention UTF-8 is expected but that its not checked for like [here](https://github.com/apache/accumulo/blob/b5461fdbf8920209635a2a650d21c902d62bbd7d/core/src/main/java/org/apache/accumulo/core/security/Authorizations.java#L143-L149) > If an additional mode is under consideration Not sure if that is the best solution, do need to resolve this problem in some way though and bring the spec and code into agreement while considering compatibility with existing data. I wrote the following test to experiment using non utf8 data in Accumulo (added this to VisibilityEvaluatorTest locally). Ran this test in 2.1.5-SNAPSHOT. This shows the code in Accumulo is happy and works w/ non utf8 data. ```java @Test public void testNonUtf8() throws VisibilityParseException { byte[] auth1 = new byte[]{(byte) 0xf0,'a'}; byte[] auth2 = new byte[]{(byte) 0xe0,'c'}; byte[] auth3 = new byte[]{(byte) 0xb0,'d'}; var decoder = StandardCharsets.UTF_8.newDecoder().onUnmappableCharacter(CodingErrorAction.REPORT).onMalformedInput(CodingErrorAction.REPORT); // verify the auths are not utf8 assertThrows(CharacterCodingException.class, ()->decoder.decode(ByteBuffer.wrap(auth1))); assertThrows(CharacterCodingException.class, ()->decoder.decode(ByteBuffer.wrap(auth2))); assertThrows(CharacterCodingException.class, ()->decoder.decode(ByteBuffer.wrap(auth3))); VisibilityEvaluator evaluator = new VisibilityEvaluator(new Authorizations(List.of(ByteBuffer.wrap(auth1), ByteBuffer.wrap(auth2)))); // Construct an expression avoiding passing data through String as it may corrupt/transform the data ByteArrayOutputStream out = new ByteArrayOutputStream(); out.writeBytes(quote(auth1)); out.write('&'); out.writeBytes(quote(auth2)); byte[] exp1 = out.toByteArray(); for(int i =0; i<exp1.length;i++){ System.out.printf("%d %x %s\n", i, exp1[i], (char)exp1[i]); } out = new ByteArrayOutputStream(); out.writeBytes(quote(auth1)); out.write('&'); out.writeBytes(quote(auth3)); byte[] exp2 = out.toByteArray(); assertFalse(evaluator.evaluate(new ColumnVisibility(exp2))); } ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
