I use Spark to read an existing table and clean it up, producing an RDD of cleaned data. What I'd like to do is write all of that data to a new table in HBase, then rename the new table to the old name. If possible, this could be done by repointing an alias at the new table (as long as all external code uses the alias), or by a single rename operation that swaps the two tables. But I don't see how to do either in HBase. I am dealing with a lot of data, so I don't want to modify the existing table in place with deletes and upserts, which would be incredibly slow. Furthermore, I don't want the table to be disabled for more than a very short time.
Is it possible to have two tables and rename both in one atomic action, or to change an alias to point to the new table in one atomic action? If not, what is the quickest way to achieve this so that the time the table is disabled is minimized?