ctubbsii commented on issue #2322:
URL: https://github.com/apache/accumulo/issues/2322#issuecomment-1136765560

   This seems related to, or perhaps a duplicate of, #608
   
   Should we close one of them?
   
   Also, I was thinking. The mapping of table IDs, which are BigInteger values, 
to their serialized form in the metadata table is not a monotonic function (it 
doesn't preserve order). So, when a new table is created, there's a good chance 
its metadata is injected in the middle of the metadata table somewhere, and 
not at the end of the table in the last tablet. This might temporarily reduce 
initial hotspotting for new tables being created, but it could also be related 
to some potential garbage collection bugs if we don't account for it properly.
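
   To make the ordering point concrete, here is a minimal sketch. It assumes 
the serialized form is the unpadded base-36 string of the ID 
(`Character.MAX_RADIX`), and that rows sort lexicographically by that string. 
The class and method names are made up for illustration and are not Accumulo 
code.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class TableIdOrder {

  // Assumption: IDs are rendered in base 36 with no padding.
  static String serialize(long id) {
    return BigInteger.valueOf(id).toString(Character.MAX_RADIX);
  }

  // Serialize the IDs first..last (inclusive) and return them in
  // lexicographic order, which is how they would sort as row prefixes.
  static List<String> lexicographicOrder(long first, long last) {
    List<String> ids = new ArrayList<>();
    for (long i = first; i <= last; i++) {
      ids.add(serialize(i));
    }
    Collections.sort(ids);
    return ids;
  }

  public static void main(String[] args) {
    // Numeric ID 36 serializes to "10", which sorts between "1" and "2".
    System.out.println(lexicographicOrder(1, 40));
  }
}
```

   Under those assumptions, the 36th ID ("10") lands right after "1" rather 
than after "z", so its metadata would go into the middle of the existing 
range, not the end.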
   
   Interestingly, the avoidance of hotspotting would only occur for the first 
few tables created, when the most significant digit is still changing 
frequently. We go back to hotspotting again quickly after a bunch of tables are 
created. This hotspotting is probably not substantial, but if we want to avoid 
it, that could be a separate ticket... to try to generate tableIds that are 
more spread out. However, if it turns out that inserting new table metadata 
(I'm particularly concerned about cloned tables, which have duplicate file 
references to existing tables) into the middle of a candidate garbage 
collection scan is the source of any bugs, we may want to just accept the 
hotspotting, and generate table IDs that are always appended to the end of the 
table, so their metadata is never inserted into the middle of a candidate scan.
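
   For the "always appended to the end" option, one possible approach (an 
assumption on my part, not something already decided here) is a fixed-width, 
zero-padded encoding, so lexicographic order matches numeric order. `WIDTH` 
and the class name below are hypothetical.

```java
import java.math.BigInteger;

public class PaddedTableId {

  // Hypothetical fixed width; 8 base-36 digits covers about 2.8 trillion IDs.
  static final int WIDTH = 8;

  // Left-pad the base-36 form with '0' so every ID has the same length.
  static String serialize(long id) {
    if (id < 0) {
      throw new IllegalArgumentException("id must be non-negative: " + id);
    }
    String s = BigInteger.valueOf(id).toString(Character.MAX_RADIX);
    if (s.length() > WIDTH) {
      throw new IllegalArgumentException("id does not fit in " + WIDTH + " digits: " + id);
    }
    StringBuilder sb = new StringBuilder(WIDTH);
    for (int i = s.length(); i < WIDTH; i++) {
      sb.append('0');
    }
    return sb.append(s).toString();
  }

  public static void main(String[] args) {
    // Each newly allocated ID should sort strictly after the previous one.
    for (long i = 0; i < 100_000; i++) {
      if (serialize(i).compareTo(serialize(i + 1)) >= 0) {
        throw new AssertionError("order not preserved at " + i);
      }
    }
    System.out.println(serialize(35) + " < " + serialize(36));
  }
}
```

   This only preserves order among IDs that share the padded encoding. Padded 
IDs start with "0", so they would sort before existing unpadded IDs, and a real 
change would need to account for tables that already exist.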


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
