Re: [PR] Gapfill: Add support for lowercase datatypes [pinot]

via GitHub Mon, 02 Oct 2023 10:43:18 -0700


zhtaoxiang commented on code in PR #11722:
URL: https://github.com/apache/pinot/pull/11722#discussion_r1342978255



##########
pinot-core/src/main/java/org/apache/pinot/core/query/reduce/BaseGapfillProcessor.java:
##########
@@ -226,4 +231,16 @@ protected List<Object[]> gapFillAndAggregate(List<Object[]> rows, DataSchema dat
       DataSchema resultTableSchema) {
     throw new UnsupportedOperationException("Not supported");
   }
+
+  protected String caseInsensitiveTypeString(String columnName) {

Review Comment:
   (I haven't checked the details of the logic, so I may be wrong.)
   This logic will also rewrite `INT_column` to `int_column` (and similarly for 
the other types) whenever the parameter contains such a string. Is that intended?
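To make the concern concrete, here is a minimal sketch. Only the first three lines of the helper are visible in the diff, so the loop below is an assumed completion (lowercasing each match), and `TypeTokenDemo`, `lowerCaseTypes`, and `lowerCaseTypesWholeWord` are hypothetical names. The second method shows one possible way to restrict the rewrite to whole tokens with `\b`; it is not something the PR proposes.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TypeTokenDemo {
  // Assumed completion of the PR's helper: lowercases every match of the type pattern,
  // including matches that sit inside a longer identifier.
  static String lowerCaseTypes(String input) {
    Matcher matcher = Pattern.compile("(BOOLEAN|INT|LONG|FLOAT|DOUBLE|STRING)", Pattern.CASE_INSENSITIVE)
        .matcher(input);
    StringBuffer result = new StringBuffer();
    while (matcher.find()) {
      // The match contains only letters, so it is safe to use as a replacement string.
      matcher.appendReplacement(result, matcher.group().toLowerCase(Locale.ROOT));
    }
    matcher.appendTail(result);
    return result.toString();
  }

  // Hypothetical variant: \b requires a word boundary on both sides. '_' counts as a
  // word character, so the INT in INT_column is no longer matched.
  static String lowerCaseTypesWholeWord(String input) {
    Matcher matcher = Pattern.compile("\\b(BOOLEAN|INT|LONG|FLOAT|DOUBLE|STRING)\\b", Pattern.CASE_INSENSITIVE)
        .matcher(input);
    StringBuffer result = new StringBuffer();
    while (matcher.find()) {
      matcher.appendReplacement(result, matcher.group().toLowerCase(Locale.ROOT));
    }
    matcher.appendTail(result);
    return result.toString();
  }

  public static void main(String[] args) {
    String expr = "CAST(INT_column AS INT)";
    System.out.println(lowerCaseTypes(expr));          // CAST(int_column AS int)
    System.out.println(lowerCaseTypesWholeWord(expr)); // CAST(INT_column AS int)
  }
}
```

Whether the column name should be left untouched depends on how the alias map keys are built, which this diff does not show.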



##########
pinot-core/src/main/java/org/apache/pinot/core/query/reduce/BaseGapfillProcessor.java:
##########
@@ -226,4 +231,16 @@ protected List<Object[]> gapFillAndAggregate(List<Object[]> rows, DataSchema dat
       DataSchema resultTableSchema) {
     throw new UnsupportedOperationException("Not supported");
   }
+
+  protected String caseInsensitiveTypeString(String columnName) {
+    String dataTypePattern = "(BOOLEAN|INT|LONG|FLOAT|DOUBLE|STRING)";
+    Matcher matcher = Pattern.compile(dataTypePattern, Pattern.CASE_INSENSITIVE).matcher(columnName);
+    StringBuffer result = new StringBuffer();

Review Comment:
   Does this code need to be thread-safe? If not, `StringBuilder` will be 
faster.
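A sketch of what the `StringBuilder` version could look like. It assumes the build targets Java 9 or later, since `Matcher.appendReplacement(StringBuilder, String)` and `Matcher.appendTail(StringBuilder)` were only added in Java 9; on Java 8 only the `StringBuffer` overloads exist. The class name, the method name, the lowercasing loop, and hoisting the pattern into a constant are all illustrative assumptions, not code from the PR.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BuilderReplaceDemo {
  // Compiled once; Pattern instances are immutable and safe to share across threads.
  private static final Pattern DATA_TYPE_PATTERN =
      Pattern.compile("(BOOLEAN|INT|LONG|FLOAT|DOUBLE|STRING)", Pattern.CASE_INSENSITIVE);

  static String lowerCaseTypes(String columnName) {
    Matcher matcher = DATA_TYPE_PATTERN.matcher(columnName);
    // The builder is a local variable that never escapes this method, so the
    // synchronization that StringBuffer performs on each call is not needed.
    StringBuilder result = new StringBuilder();
    while (matcher.find()) {
      matcher.appendReplacement(result, matcher.group().toLowerCase(Locale.ROOT));
    }
    matcher.appendTail(result);
    return result.toString();
  }

  public static void main(String[] args) {
    System.out.println(lowerCaseTypes("Sum(INT)")); // Sum(int)
  }
}
```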



##########
pinot-core/src/main/java/org/apache/pinot/core/query/reduce/BaseGapfillProcessor.java:
##########
@@ -134,6 +136,9 @@ protected void replaceColumnNameWithAlias(DataSchema dataSchema) {
     for (int i = 0; i < dataSchema.getColumnNames().length; i++) {
       if (columnNameToAliasMap.containsKey(dataSchema.getColumnNames()[i])) {
         dataSchema.getColumnNames()[i] = columnNameToAliasMap.get(dataSchema.getColumnNames()[i]);
+      } else if (columnNameToAliasMap.containsKey(caseInsensitiveTypeString(dataSchema.getColumnNames()[i]))) {
+        dataSchema.getColumnNames()[i] =
+                columnNameToAliasMap.get(caseInsensitiveTypeString(dataSchema.getColumnNames()[i]));

Review Comment:
   1. With this code, I think we can remove the first `if` statement below to 
simplify the code (even though `caseInsensitiveTypeString` would then always run, 
which may be a bit slower). What do you think?
   ```
   if (columnNameToAliasMap.containsKey(dataSchema.getColumnNames()[i])) {
       dataSchema.getColumnNames()[i] = columnNameToAliasMap.get(dataSchema.getColumnNames()[i]);
   }
   ```
   2. We can define a variable `lowerCaseColumnName = 
caseInsensitiveTypeString(columnName)` to avoid executing the same code twice.
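A rough sketch of the second suggestion, using a plain `String[]` and `Map` as stand-ins for `DataSchema` and the processor's alias map. The class name and the body of `caseInsensitiveTypeString` are assumptions for illustration (the diff shows only its first lines), and the `StringBuilder` overloads need Java 9 or later. The exact-match lookup is kept here because dropping it is only equivalent if every key in the map is already in normalized form, which the diff does not confirm.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AliasReplaceSketch {
  private static final Pattern DATA_TYPE_PATTERN =
      Pattern.compile("(BOOLEAN|INT|LONG|FLOAT|DOUBLE|STRING)", Pattern.CASE_INSENSITIVE);

  // Stand-in for the PR's helper; assumed to lowercase each type token.
  static String caseInsensitiveTypeString(String columnName) {
    Matcher matcher = DATA_TYPE_PATTERN.matcher(columnName);
    StringBuilder result = new StringBuilder();
    while (matcher.find()) {
      matcher.appendReplacement(result, matcher.group().toLowerCase(Locale.ROOT));
    }
    matcher.appendTail(result);
    return result.toString();
  }

  // Assumes the alias map holds no null values, so get() == null means "no entry".
  static void replaceColumnNameWithAlias(String[] columnNames, Map<String, String> columnNameToAliasMap) {
    for (int i = 0; i < columnNames.length; i++) {
      String columnName = columnNames[i];
      String alias = columnNameToAliasMap.get(columnName);
      if (alias == null) {
        // Normalize once and reuse the result instead of calling the helper twice.
        String lowerCaseColumnName = caseInsensitiveTypeString(columnName);
        alias = columnNameToAliasMap.get(lowerCaseColumnName);
      }
      if (alias != null) {
        columnNames[i] = alias;
      }
    }
  }

  public static void main(String[] args) {
    Map<String, String> aliases = new HashMap<>();
    aliases.put("cast(x AS int)", "x_alias");
    String[] names = {"cast(x AS INT)", "other"};
    replaceColumnNameWithAlias(names, aliases);
    System.out.println(String.join(", ", names)); // x_alias, other
  }
}
```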



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

