eric-maynard commented on code in PR #1686: URL: https://github.com/apache/polaris/pull/1686#discussion_r2124940156
########## extension/persistence/relational-jdbc/src/test/java/org/apache/polaris/extension/persistence/relational/jdbc/QueryGeneratorTest.java: ########## @@ -222,4 +222,29 @@ void testGenerateWhereClause_emptyMap() { Map<String, Object> whereClause = Collections.emptyMap(); assertEquals("", QueryGenerator.generateWhereClause(whereClause)); } + + @Test + void testGenerateOverlapQuery() { + String realmId = "realmId"; + int parentId = "polaris".hashCode(); + + assertEquals( + "SELECT entity_version, to_purge_timestamp, internal_properties, " + + "catalog_id, purge_timestamp, sub_type_code, create_timestamp, last_update_timestamp, " + + "parent_id, name, location, id, drop_timestamp, properties, grant_records_version, " + + "type_code FROM POLARIS_SCHEMA.ENTITIES WHERE realm_id = 'realmId' AND parent_id = " + + "-398224152 AND (1 = 2 OR location = '/' OR location = '/tmp/' OR location = '/tmp/location/' " + + "OR location LIKE '/tmp/location/%')", + QueryGenerator.generateOverlapQuery(realmId, parentId, "/tmp/location/")); + + assertEquals( + "SELECT entity_version, to_purge_timestamp, internal_properties, catalog_id, " + + "purge_timestamp, sub_type_code, create_timestamp, last_update_timestamp, parent_id, " + + "name, location, id, drop_timestamp, properties, grant_records_version, type_code " + + "FROM POLARIS_SCHEMA.ENTITIES WHERE realm_id = 'realmId' AND parent_id = -398224152 " + + "AND (1 = 2 OR location = 's3:/' OR location = 's3://' OR location = 's3://bucket/' OR " + + "location = 's3://bucket/tmp/' OR location = 's3://bucket/tmp/location/' OR location " Review Comment: > If we later support a storage technology with case-insensitive file paths, the existing implementation ... will have a hard time supporting it We will have to return `Optional.empty()` for the `hasOverlappingSiblings` method when that storage type is in use. So we'll have an easy time supporting this hypothetical storage technology, just not using the optimized check. However, the 4 storage technologies we currently actually support do work with the optimized check. > Conversely, if the admin user for some reason (accidentally?) configures the database to perform case-insensitive comparisons If this is a real configuration, then tons of things in our existing persistence implementation will break very severely, such as lookup by name. We rely on the database being case-sensitive already, so this PR isn't introducing any new dependency there. > Also, the parsing of the URI into path components (prefixes) should probably be generalized outside the Persistence layer Only one persistence implementation currently supports this check, so encapsulating the logic there seems wise. > In that regard, the presence of any entity with a location, but without an indexed location, should probably invalidate the optimized lookup result. This is not a state you can get into to have in the current implemenation. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@polaris.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org