EdColeman commented on code in PR #4208:
URL: https://github.com/apache/accumulo/pull/4208#discussion_r1476753090
##########
server/manager/src/main/java/org/apache/accumulo/manager/tableOps/tableImport/PopulateMetadataTable.java:
##########
@@ -155,11 +166,7 @@ public Repo<Manager> call(long tid, Manager manager)
throws Exception {
if (m == null || !currentRow.equals(metadataRow)) {
if (m != null) {
- if (!sawTabletAvailability) {
- // add a default tablet availability
- TabletColumnFamily.AVAILABILITY_COLUMN.put(m,
- TabletAvailabilityUtil.toValue(TabletAvailability.ONDEMAND));
- }
+ AVAILABILITY_COLUMN.put(m, TabletAvailabilityUtil.toValue(initialAvailability));
Review Comment:
I think the distinction will come down to how availability is perceived. Is
it an inherent property of a table, or is it more closely related to the system
and environment? Things like splits and table properties seem closely tied to
the table and the data. Availability may not have that same affinity: it
depends more on the system environment at any particular point in time.
As an example, say you had historical data that was infrequently used except
for something like a monthly or yearly report. When it comes time to run the
report(s), the availability is set to HOSTED while the reports are running;
otherwise ONDEMAND is fine for an occasional query. With that type of usage,
the availability is not tied to the table / data, but instead relates directly
to the system usage at that point in time.
Another example could be if you were hosting data to train some kind of
model. The data is only needed when running training for a specific model set
- otherwise it can be kept offline or ondemand.
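
To make the report scenario concrete, here is a rough sketch of the usage pattern I have in mind. Everything in it is hypothetical: `AvailabilityWindow`, its `Availability` enum, and the `setter` callback are stand-ins for illustration, not the Accumulo API, and a real caller would wire the callback to whatever operation actually changes tablet availability.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for this sketch; not the Accumulo API.
final class AvailabilityWindow {

  // Mirrors the two availability names used in the comment above.
  enum Availability { HOSTED, ONDEMAND }

  private AvailabilityWindow() {}

  /**
   * Raises availability to HOSTED for the duration of a job, then restores
   * the resting value, even if the job throws.
   *
   * @param setter  assumed callback that applies an availability to the table
   * @param resting availability to return to once the job finishes
   * @param job     the work that needs the tablets hosted, e.g. a report run
   */
  static void runHosted(Consumer<Availability> setter, Availability resting, Runnable job) {
    setter.accept(Availability.HOSTED);
    try {
      job.run();
    } finally {
      setter.accept(resting);
    }
  }

  public static void main(String[] args) {
    // Record each availability change instead of touching a real table.
    List<Availability> history = new ArrayList<>();
    runHosted(history::add, Availability.ONDEMAND,
        () -> System.out.println("running yearly report"));
    System.out.println(history); // prints [HOSTED, ONDEMAND]
  }
}
```

The point of the sketch is that the availability change is driven by when the job runs, not by anything stored with the table or its data.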