[ https://issues.apache.org/jira/browse/AURORA-1861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Karthik Anantha Padmanabhan updated AURORA-1861:
------------------------------------------------
Comment: was deleted
(was: Not a blocker, but some of the Scalding jobs and Python scripts read from
the snapshot.
I don't think we use jobUpdate data anywhere in the Scalding jobs, but I'm
bringing it up anyway.
Before we remove it entirely, we should provide a utility like the one you
mentioned in your first comment.
)
> Remove duplicate Snapshot fields for DB stores
> ----------------------------------------------
>
> Key: AURORA-1861
> URL: https://issues.apache.org/jira/browse/AURORA-1861
> Project: Aurora
> Issue Type: Task
> Components: Scheduler
> Reporter: David McLaughlin
> Assignee: David McLaughlin
>
> Currently we double-write any DB-backed stores into a Snapshot struct when
> creating a Snapshot. This inflates the size of the Snapshot, which is already
> a problem for large production clusters (see AURORA-74).
> Example for LockStore from
> https://github.com/apache/aurora/blob/master/src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotStoreImpl.java:
> {code}
> new SnapshotField() {
>   // It's important for locks to be replayed first, since there are relations
>   // that expect references to be valid on insertion.
>   @Override
>   public void saveToSnapshot(MutableStoreProvider store, Snapshot snapshot) {
>     snapshot.setLocks(ILock.toBuildersSet(store.getLockStore().fetchLocks()));
>   }
>
>   @Override
>   public void restoreFromSnapshot(MutableStoreProvider store, Snapshot snapshot) {
>     if (hasDbSnapshot(snapshot)) {
>       LOG.info("Deferring lock restore to dbsnapshot");
>       return;
>     }
>     store.getLockStore().deleteLocks();
>     if (snapshot.isSetLocks()) {
>       for (Lock lock : snapshot.getLocks()) {
>         store.getLockStore().saveLock(ILock.build(lock));
>       }
>     }
>   }
> },
> {code}
> The saveToSnapshot here is entirely redundant, because the whole H2 database
> is already dumped into the dbScript field.
> Note: one major side effect is that anyone reading these snapshots and using
> the data outside of Java will lose the ability to process the data unless
> they can also apply the DB script.
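For illustration, a minimal sketch of what single-writing could look like. Every type here (Snapshot, LockStore, the "LOCK " line format) is a simplified, hypothetical stand-in rather than the real Aurora or Thrift API, and the fallback to the legacy field for older snapshots is an assumption, not something the issue specifies.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch only: every type here is a simplified, hypothetical
// stand-in and not the real Aurora or Thrift API.
final class SnapshotSketch {
  /** Stand-in for the Snapshot struct. */
  static final class Snapshot {
    Set<String> locks;      // legacy per-store field; null means "not set"
    List<String> dbScript;  // stand-in for the H2 dump; null means "not set"
  }

  /** Stand-in for a DB-backed lock store. */
  static final class LockStore {
    final Set<String> locks = new HashSet<>();
  }

  static boolean hasDbSnapshot(Snapshot snapshot) {
    return snapshot.dbScript != null;
  }

  /** Writes the store only once, into dbScript; the locks field stays unset. */
  static Snapshot createSnapshot(LockStore store) {
    Snapshot snapshot = new Snapshot();
    List<String> script = new ArrayList<>();
    for (String lock : store.locks) {
      script.add("LOCK " + lock);  // made-up line format for this sketch
    }
    snapshot.dbScript = script;
    return snapshot;
  }

  /** Restores from dbScript when present, else falls back to the legacy field. */
  static void restore(LockStore store, Snapshot snapshot) {
    store.locks.clear();
    if (hasDbSnapshot(snapshot)) {
      for (String line : snapshot.dbScript) {
        if (line.startsWith("LOCK ")) {
          store.locks.add(line.substring("LOCK ".length()));
        }
      }
      return;
    }
    if (snapshot.locks != null) {
      store.locks.addAll(snapshot.locks);
    }
  }
}
```

In this sketch the lock data appears only in dbScript, so a reader that cannot replay the script sees no locks at all, which is the side effect noted in the description.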