navinko commented on PR #9529:
URL: https://github.com/apache/ozone/pull/9529#issuecomment-3675877806

   Thanks @adoroszlai for reviewing 
   
   > Can it be replaced in `processTableInParallel`, too?
   
   processTableInParallel does a lot of background work to run in parallel,
   and that work requires the key type to be Comparable.
   
https://github.com/navinko/ozone/blob/1c817b0460f1e66e75aef0b8f5fde63c691f2a65/hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/tasks/util/ParallelTableIteratorOperation.java#L128
   
   String keys are Comparable, but CodecBuffer is not.
   
https://github.com/navinko/ozone/blob/1c817b0460f1e66e75aef0b8f5fde63c691f2a65/hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/utils/db/CodecBuffer.java#L53
   
   --> processTableSequentially
   Works for non-String keys. The keys are just byte arrays (binary data), and
   TableIterator simply iterates over them. No two raw keys are ever compared
   and there is no range splitting, so CodecBuffer works here even though it
   does not implement Comparable.
   
   Table<byte[], byte[]> table = omMetadataManager.getStore()
       .getTable(tableName, ByteArrayCodec.get(), ByteArrayCodec.get(),
           TableCache.CacheType.NO_CACHE);
   
   --> processTableInParallel
   The keys are Strings, which are Comparable. ParallelTableIteratorOperation.java
   splits the key range across multiple workers and compares keys, e.g.
   key.compareTo(startKey), which is possible only because Strings are
   Comparable.
     
   Table<String, byte[]> table = omMetadataManager.getStore()
       .getTable(tableName, StringCodec.get(), ByteArrayCodec.get(),
           TableCache.CacheType.NO_CACHE);
              
   This seems to be the reason there are separate flows for sequential and
   parallel processing, and why CodecBuffer should be used only in the existing
   implementation of processTableSequentially, not in processTableInParallel.
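
   To illustrate the difference, here is a minimal sketch of the two kinds of
   method. It is not the Recon code: KeyBoundSketch, countSequentially, inRange
   and splitByRange are hypothetical names made up for this comment, and the
   range logic is only assumed to resemble what ParallelTableIteratorOperation
   does. The point is the type bound: the sequential path accepts any key type,
   while the range-splitting path only compiles for keys that are Comparable.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch; none of these names come from the Ozone code base.
class KeyBoundSketch {

  // Sequential scan: keys are only visited in iterator order and never
  // compared, so K needs no bound. An opaque buffer type works here.
  static <K> long countSequentially(Iterator<K> keys) {
    long count = 0;
    while (keys.hasNext()) {
      keys.next();
      count++;
    }
    return count;
  }

  // Parallel scan: each worker owns the half-open range [startKey, endKey)
  // and tests membership with compareTo, so K has to be Comparable.
  // A null endKey means the range is open-ended.
  static <K extends Comparable<K>> boolean inRange(K key, K startKey, K endKey) {
    return key.compareTo(startKey) >= 0
        && (endKey == null || key.compareTo(endKey) < 0);
  }

  // Assigns each key to the range that contains it. bounds holds the sorted
  // start keys of the ranges; keys below the first bound are skipped.
  static <K extends Comparable<K>> List<List<K>> splitByRange(
      List<K> keys, List<K> bounds) {
    List<List<K>> ranges = new ArrayList<>();
    for (int i = 0; i < bounds.size(); i++) {
      ranges.add(new ArrayList<>());
    }
    for (K key : keys) {
      for (int i = 0; i < bounds.size(); i++) {
        K end = i + 1 < bounds.size() ? bounds.get(i + 1) : null;
        if (inRange(key, bounds.get(i), end)) {
          ranges.get(i).add(key);
          break;
        }
      }
    }
    return ranges;
  }

  public static void main(String[] args) {
    List<String> keys = Arrays.asList("/vol1/a", "/vol1/m", "/vol2/b", "/vol3/z");
    System.out.println(countSequentially(keys.iterator()));
    System.out.println(splitByRange(keys, Arrays.asList("/vol1", "/vol2")));
  }
}
```

   With a key type that does not implement Comparable, countSequentially still
   compiles, but a call to splitByRange is rejected by the compiler because of
   the `K extends Comparable<K>` bound.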
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

