+1 to remove them and then re-create them later - based on the state of
the AsterixDB storage world and cluster dynamics at that time. (I think
we'll have a better chance of getting them perfect if we re-do them then
- I don't remember that it took Yingyi very long to do them the first
time, so I think the re-do path will beat the fix-up path if we want
them again.) As far as I know, since we don't document them, nobody is
using them - and I think the engineering cost of maintaining orphaned
code is too high (not worth it).
Any thoughts to the contrary?
Cheers,
Mike
On 11/20/17 10:52 AM, Murtadha Hubail wrote:
Hi all,
As you might be aware, we have a feature in AsterixDB to create temporary
datasets that differ from regular datasets in some ways such as:
- Their existence is not persisted in metadata; it is recorded only in the CC
  metadata cache.
- They don’t generate any transaction logs.
- Their files are deleted on NC restart.
- If they are not accessed for some period of time, their metadata records are
  removed from the CC metadata cache.
Temporary datasets were originally introduced to serve as a staging area
between AsterixDB and external systems such as Pregelix. However, as the system
evolved over the years, the assumptions they were built on don’t hold anymore
and they could lead to undesired consequences such as leaking files after a CC
restart or inability to access the dataset files on a restarted NC. Therefore,
I’m proposing to remove support for the current temporary datasets; we
may add the feature back with a careful design at a later stage.
Any thoughts or concerns on removing temporary datasets?
Cheers,
Murtadha