stevenzwu commented on code in PR #16160: URL: https://github.com/apache/iceberg/pull/16160#discussion_r3358891990
##########
api/src/main/java/org/apache/iceberg/catalog/CatalogObjectIdentifier.java:
##########
@@ -0,0 +1,96 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.iceberg.catalog;
+
+import java.util.Arrays;
+import java.util.function.Predicate;
+import java.util.regex.Pattern;
+import org.apache.iceberg.relocated.com.google.common.base.Joiner;
+import org.apache.iceberg.relocated.com.google.common.base.Preconditions;
+
+/**
+ * A reference to a catalog object as an ordered list of hierarchical levels (for example, a table,
+ * view, or namespace). The kind of object is determined by context — the endpoint or a companion
+ * type discriminator — not by the identifier structure alone.
+ *
+ * <p>Mirrors {@link Namespace} structurally; the distinct name signals "any object within a
+ * catalog" and avoids confusion with a future top-level catalog name.
+ */
+public class CatalogObjectIdentifier {
+  private static final Joiner DOT = Joiner.on('.');
+  private static final Predicate<String> CONTAINS_NULL_CHARACTER =

Review Comment:
   This mirrors the null-byte check on `Namespace` levels (added in [#3938](https://github.com/apache/iceberg/pull/3938)); the class Javadoc explicitly notes that this type mirrors `Namespace` structurally. The motivation from #3938 still applies: a `\u0000` inside a level corrupts encodings that use that byte as a delimiter (Hive metastore, some catalog backends), so failing fast at construction is the safe default. `TableIdentifier` does not have the check, but `Namespace` is the closer precedent for this type.


##########
api/src/main/java/org/apache/iceberg/catalog/CatalogObjectIdentifier.java:
##########
@@ -0,0 +1,96 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.iceberg.catalog;
+
+import java.util.Arrays;
+import java.util.function.Predicate;
+import java.util.regex.Pattern;
+import org.apache.iceberg.relocated.com.google.common.base.Joiner;
+import org.apache.iceberg.relocated.com.google.common.base.Preconditions;
+
+/**
+ * A reference to a catalog object as an ordered list of hierarchical levels (for example, a table,
+ * view, or namespace). The kind of object is determined by context — the endpoint or a companion
+ * type discriminator — not by the identifier structure alone.
+ *
+ * <p>Mirrors {@link Namespace} structurally; the distinct name signals "any object within a
+ * catalog" and avoids confusion with a future top-level catalog name.
+ */
+public class CatalogObjectIdentifier {
+  private static final Joiner DOT = Joiner.on('.');
+  private static final Predicate<String> CONTAINS_NULL_CHARACTER =
+      Pattern.compile("\u0000", Pattern.UNICODE_CHARACTER_CLASS).asPredicate();
+
+  public static CatalogObjectIdentifier of(String... levels) {
+    Preconditions.checkArgument(
+        null != levels, "Cannot create catalog object identifier from null array");
+
+    for (String level : levels) {
+      Preconditions.checkNotNull(
+          level, "Cannot create a catalog object identifier with a null level");
+      Preconditions.checkArgument(
+          !CONTAINS_NULL_CHARACTER.test(level),
+          "Cannot create a catalog object identifier with the null-byte character");
+    }

Review Comment:
   Will move the per-level validation into the private constructor. Treating the constructor as the universal invariant point is the more defensive convention: the rule survives any future construction path (additional factories, test-only construction, reflective deserializers). I was following `Namespace`'s factory-only style earlier, but I'm happy to flip, and could align `Namespace` in a follow-up if the consistency is worth it.
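
   For illustration only, a minimal sketch of what constructor-side validation could look like. `SketchIdentifier` is a hypothetical stand-in, not the class in this PR; it uses plain JDK checks instead of the relocated Guava `Preconditions`, and the defensive copy is an assumption of the sketch rather than something the PR is known to do.

```java
import java.util.Objects;

// Hypothetical stand-in for CatalogObjectIdentifier; names and details are assumptions.
final class SketchIdentifier {
  private final String[] levels;

  // Every construction path funnels through here, so the invariants are checked here.
  private SketchIdentifier(String[] levels) {
    if (levels == null) {
      throw new IllegalArgumentException(
          "Cannot create catalog object identifier from null array");
    }
    for (String level : levels) {
      Objects.requireNonNull(
          level, "Cannot create a catalog object identifier with a null level");
      if (level.indexOf('\0') >= 0) {
        throw new IllegalArgumentException(
            "Cannot create a catalog object identifier with the null-byte character");
      }
    }
    // Sketch-only choice: copy so callers cannot mutate the validated levels later.
    this.levels = levels.clone();
  }

  // The factory only delegates; it carries no validation of its own.
  static SketchIdentifier of(String... levels) {
    return new SketchIdentifier(levels);
  }

  @Override
  public String toString() {
    return String.join(".", levels);
  }
}
```

   With this shape, a second factory or a test-only construction path would pick up the same null and null-byte checks without repeating them.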
