[GitHub] [hive] saihemanth-cloudera commented on a diff in pull request #4601: HIVE-27595: replaced cartesian-product table filtering by sort + binary search

via GitHub Mon, 14 Aug 2023 15:53:11 -0700


saihemanth-cloudera commented on code in PR #4601:
URL: https://github.com/apache/hive/pull/4601#discussion_r1294040144



##########
ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HivePrivilegeObjectUtils.java:
##########
@@ -59,7 +59,74 @@ public static List<HivePrivilegeObject> 
getHivePrivDcObjects(List<String> dcList
       objs.add(new HivePrivilegeObject(HivePrivilegeObjectType.DATACONNECTOR, 
null, dcname));
     }
     return objs;
+  }
+
+
+  /**
+   * An index on a table list by dbName, tableName.
+   */
+  public abstract static class TableIndex<T> {
+    private final T[] index;
+    protected abstract String getDbName(T item);
+    protected abstract String getTableName(T item);
+    protected TableIndex(List<T> tables) {
+      if (tables != null) {
+        index = tables.toArray((T[]) new Object[0]);
+        // sort them by dbname, tblname
+        Arrays.sort(index, (left, right)->{
+          int cmp = getDbName(left).compareTo(getDbName(right));
+          return cmp != 0
+              ? cmp
+              : getTableName(left).compareTo(getTableName(right));
+        });
+      } else {
+        index = (T[]) new Object[0];
+      }
+    }
 
+    /**
+     * Lookup using dichotomy using order described by Name, tblName
+     * @param dbName the database name
+     * @param tableName the table name
+     * @return the table if found in the index, null otherwise
+     */
+    public T lookup(String dbName, String tableName) {
+      int low = 0;
+      int high = index.length - 1;
+      while (low <= high) {
+        int mid = (low + high) >>> 1;
+        T item = index[mid];
+        int cmp = getDbName(item).compareTo(dbName);
+        if (cmp == 0) {
+          cmp = getTableName(item).compareTo(tableName);
+        }
+        if (cmp < 0)
+          low = mid + 1;
+        else if (cmp > 0)
+          high = mid - 1;
+        else
+          return item; // key found
+      }
+      return null;  // key not found.
+    }
   }
 
+  /**
+   * Specialized index for HivePrivilegeObject.
+   */
+  public static class PrivilegeIndex extends TableIndex<HivePrivilegeObject> {

Review Comment:
   I have another suggestion regarding the implementation. I would like to use 
a Java HashMap for it,
   Key: <String> - 'DbName'.'TableName'
   Value: <Table> - Store the unfiltered table.
   So we store the unfiltered tables list(before converting it into 
HivePrivilegeObject and sending them for authorization) in the form of a map 
like the above. Once we get the filtered list, we simply look up the 
'dbName'.'tableName' in the map and return the table. 
   Space complexity: O(N) -- N is the size of the table list
   Run time complexity: O(1) -- HashMap lookup is constant order.
   
   I'm suggesting this because, theoretically even with the current 
implementation we are using up O(N) space. In practice, the space requirements 
of the current implementation is more efficient than what I suggested. For 
runtime complexity, we avoid NlogN look-up time if we go with the above 
suggestion.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org
For additional commands, e-mail: gitbox-h...@hive.apache.org

[GitHub] [hive] saihemanth-cloudera commented on a diff in pull request #4601: HIVE-27595: replaced cartesian-product table filtering by sort + binary search

Reply via email to