kgyrtkirk commented on code in PR #17084:
URL: https://github.com/apache/druid/pull/17084#discussion_r1762536527
##########
processing/src/main/java/org/apache/druid/segment/VirtualColumns.java:
##########
@@ -172,10 +174,28 @@ public VirtualColumn getVirtualColumn(String columnName)
if (vc != null) {
return vc;
}
+ if (hasNoDotColumns) {
+ return null;
+ }
final String baseColumnName = splitColumnName(columnName).lhs;
return withDotSupport.get(baseColumnName);
}
+ /**
+ * Check if a virtual column is already defined which is the same as some
other virtual column, ignoring output name,
+ * returning that virtual column if it exists, or null if there is no
equivalent virtual column.
+ */
+ @Nullable
+ public VirtualColumn findEquivalent(VirtualColumn virtualColumn)
Review Comment:
why there is a need to expose this directly?
wouldn't it be better to make it the internal detail/contract of
`VirtualColumns` that any instance will make sure to reuse existing `VC`
columns?
##########
processing/src/main/java/org/apache/druid/segment/VirtualColumn.java:
##########
@@ -336,4 +336,12 @@ default ColumnIndexSupplier getIndexSupplier(
{
return NoIndexesColumnIndexSupplier.getInstance();
}
+
+ /**
+ * Check if a virtual column is the same as some other virtual column,
ignoring output name.
+ */
+ default boolean isEquivalent(VirtualColumn other)
+ {
+ return this.equals(other);
+ }
Review Comment:
I don't think this is a good move; you are adding a method which will resort
to object level compare (as default implementation).
why not make sure to implement hashcode + equals?
another pro of using equals is that it has the same signature ; and its a
base method
##########
processing/src/main/java/org/apache/druid/segment/VirtualColumns.java:
##########
@@ -172,10 +174,28 @@ public VirtualColumn getVirtualColumn(String columnName)
if (vc != null) {
return vc;
}
+ if (hasNoDotColumns) {
+ return null;
+ }
final String baseColumnName = splitColumnName(columnName).lhs;
return withDotSupport.get(baseColumnName);
}
+ /**
+ * Check if a virtual column is already defined which is the same as some
other virtual column, ignoring output name,
+ * returning that virtual column if it exists, or null if there is no
equivalent virtual column.
+ */
+ @Nullable
+ public VirtualColumn findEquivalent(VirtualColumn virtualColumn)
+ {
+ for (VirtualColumn vc : virtualColumns) {
+ if (vc.isEquivalent(virtualColumn)) {
Review Comment:
wouldn't this this for loop will cost `O(n^2)` if it gets run for all VC-s
in the query;
I think another pro of making it an internal contract/detail could enable
the use of a `Set` so that it doesn't involve much cost
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]