keith-turner opened a new issue, #3559:
URL: https://github.com/apache/accumulo/issues/3559

   FATE operations store their data in ZooKeeper. This is fine because FATE is used for a relatively low volume of operations. However, in #3462 a use case arose where using FATE to commit compactions would have been nice, but it was deemed that committing compactions would be too high volume for ZooKeeper. So as part of #3462 a simple ~refresh metadata section was created to ensure that tablet refresh happens after compaction commit even when the process dies between commit and refresh. This is similar in functionality to FATE, but much simpler and more specialized. Using FATE for this use case would be nice as it would reduce custom code in Accumulo.
   
   The ~refresh entry introduced as part of #3462 is stored in ZK, the root table, or the metadata table depending on which tablet is committing a compaction. Could FATE data be stored in a similar way? Specifically, could the following be done (a rough sketch follows the list)?
   
    * For root table and system FATE operations, store the FATE data in ZK
    * For metadata table FATE operations, store the FATE data in the root table
    * For user table FATE operations, store the FATE data in the metadata table
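
   The following is only an illustrative sketch of that routing. None of these class or enum names exist in Accumulo today; the table IDs `+r` and `!0` are the existing root and metadata table IDs.

```java
// Hypothetical sketch only: decide where a FATE transaction's data would be
// stored based on the table the operation targets. These types are invented
// for illustration, not existing Accumulo classes.
enum FateStoreLocation {
  ZOOKEEPER,      // root table and system-wide FATE operations
  ROOT_TABLE,     // FATE operations against the metadata table
  METADATA_TABLE  // FATE operations against user tables
}

class FateStoreSelector {
  static final String ROOT_TABLE_ID = "+r";      // accumulo.root
  static final String METADATA_TABLE_ID = "!0";  // accumulo.metadata

  static FateStoreLocation locationFor(String tableId) {
    if (tableId == null || ROOT_TABLE_ID.equals(tableId)) {
      // system operations (no table involved) and root table operations
      return FateStoreLocation.ZOOKEEPER;
    } else if (METADATA_TABLE_ID.equals(tableId)) {
      return FateStoreLocation.ROOT_TABLE;
    } else {
      return FateStoreLocation.METADATA_TABLE;
    }
  }
}
```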
   
   Currently many FATE operations are tightly coupled to table locks. These locks would also need to be stored in the metadata table, and it is not clear whether that could be done using conditional mutations. That is an example of something that could cause problems for this strategy.
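
   To make the conditional mutation question concrete, below is a rough sketch of acquiring a table lock via the public `ConditionalWriter` API. The row and column names are made up for this sketch, and it ignores read/write lock semantics, lock release, and retries.

```java
import org.apache.accumulo.core.client.AccumuloClient;
import org.apache.accumulo.core.client.AccumuloException;
import org.apache.accumulo.core.client.AccumuloSecurityException;
import org.apache.accumulo.core.client.ConditionalWriter;
import org.apache.accumulo.core.client.ConditionalWriterConfig;
import org.apache.accumulo.core.client.TableNotFoundException;
import org.apache.accumulo.core.data.Condition;
import org.apache.accumulo.core.data.ConditionalMutation;

public class FateTableLockSketch {

  // Hypothetical layout: one row per locked table, with the lock holder's
  // FATE transaction id stored in a lock:fateTx column. Not an existing schema.
  static boolean tryLockTable(AccumuloClient client, String metadataTable,
      String lockedTableId, String fateTxId)
      throws TableNotFoundException, AccumuloException, AccumuloSecurityException {

    try (ConditionalWriter writer =
        client.createConditionalWriter(metadataTable, new ConditionalWriterConfig())) {

      ConditionalMutation cm = new ConditionalMutation("~lock:" + lockedTableId);
      // A Condition with no value set requires the column to be absent, i.e.
      // no other FATE transaction currently holds the lock on this table.
      cm.addCondition(new Condition("lock", "fateTx"));
      cm.put("lock", "fateTx", fateTxId);

      ConditionalWriter.Result result = writer.write(cm);
      return result.getStatus() == ConditionalWriter.Status.ACCEPTED;
    }
  }
}
```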
   
   The benefit of this change is that it would open FATE up to more use cases and would place less memory and write pressure on ZooKeeper for the current use cases. However, if #3462 is the only use case that needs this, it may not be worthwhile.
   
   Another possible use case that could benefit from this comes from #3382, where system-initiated splits are not executed as a FATE operation. System-initiated splits would not normally cause the sustained high levels of metadata activity that compactions would, but they could cause bursts of metadata activity that would place a lot of write and memory pressure on ZK, for example if a table decides all of its tablets need to split around the same time.
   
   Also wondering if this could help make running multiple managers possible by making it easier to partition the FATE operations.
   
   The purpose of this issue is to decide if this is workable and worthwhile.

