keith-turner commented on issue #5254:
URL: https://github.com/apache/accumulo/issues/5254#issuecomment-2651539056

   > @keith-turner - I was thinking that colsToRead was being used because I 
updated 
[this](https://github.com/apache/accumulo/blob/ec779446cfd926ae10bb0a4362f1ed65ab76925d/server/base/src/main/java/org/apache/accumulo/server/metadata/iterators/TabletMetadataCheckIterator.java#L105)
 line to pass it to TabletMetadata.convertRow() but looking again all it does 
is set the columns which were fetched and doesn't do any filtering.
   
   Yeah that does not do any filtering.  That info being passed to convertRow 
is only for validating that data requested from the TabletMetadata object was 
actually fetched by the scan.  
   
   > What is the best way to filter in this case? Should we create another 
iterator to wrap the source to skip columns or make a change to 
TabletMetadata.convertRow() to skip columns not specified as part of colsToRead 
etc?
   
   We could add something like the following.
   
   ```java
   class TabletMetadata {
       enum ColumnType {
           // ... existing column type constants ...
           public static Set<ByteSequence> resolveFamilies(Set<ColumnType> columns) {
               // TODO build a set of the families used by these column types
           }
       }
   }
   ```
   
   and then TabletMetadataCheckIterator could make the following changes to 
have the source iterator filter on families.  This would not filter on 
qualifiers, but I think that is fine; an additional iterator would be needed 
to also filter on qualifiers.  Filtering only on families is still correct, 
narrows the data read, and lets locality groups kick in.
   
   ```java
       if (colsToRead.equals(TabletMetadataCheck.ALL_COLUMNS)) {
         // want all columns so no need to filter on families
         source.seek(new Range(tabletRow), Set.of(), false);
       } else {
         Set<ByteSequence> families = TabletMetadata.ColumnType.resolveFamilies(colsToRead);
         source.seek(new Range(tabletRow), families, true);
       }
   ```
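
   For what it's worth, here is a rough, self-contained sketch of how the 
`resolveFamilies` TODO might be filled in.  Everything in it is an assumption 
made for illustration: the `ResolveFamiliesSketch` class, the handful of enum 
constants, the per-constant `family` field, and the family strings are 
placeholders, and it uses `String` instead of `ByteSequence` so it runs without 
Accumulo on the classpath.  It mainly shows that column types sharing a family 
collapse to one entry, which is why a family-only seek can still return 
qualifiers that were not asked for.

   ```java
   import java.util.EnumSet;
   import java.util.Set;
   import java.util.TreeSet;

   public class ResolveFamiliesSketch {

     // Hypothetical stand-in for TabletMetadata.ColumnType. The constants and
     // family strings are illustrative placeholders, not the real schema.
     enum ColumnType {
       LOCATION("loc"), FILES("file"), PREV_ROW("~tab"), DIR("srv"), TIME("srv");

       private final String family;

       ColumnType(String family) {
         this.family = family;
       }

       // Collect the distinct families backing the requested column types.
       static Set<String> resolveFamilies(Set<ColumnType> columns) {
         Set<String> families = new TreeSet<>();
         for (ColumnType column : columns) {
           families.add(column.family);
         }
         return families;
       }
     }

     public static void main(String[] args) {
       // DIR and TIME share a family in this sketch, so one family comes back.
       System.out.println(
           ColumnType.resolveFamilies(EnumSet.of(ColumnType.DIR, ColumnType.TIME)));
       // Column types in different families yield one entry each.
       System.out.println(
           ColumnType.resolveFamilies(EnumSet.of(ColumnType.LOCATION, ColumnType.FILES)));
     }
   }
   ```

   In the real change the set elements would presumably be `ByteSequence` 
values so the result can be handed straight to `source.seek(...)`.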

