Hi,
I'm curious whether there is a way to create a zero-copy union of two tables.
Currently, I augment an existing table by adding new columns (for
example, moving averages; see the snippet below). At the end I swap the
pointer to the new table and release the old table's memory (reset).
I wonder whether it would be more efficient to create a new table that
holds only the new columns and then build some kind of "zero-copy table
union" of that new table with the old one. Does that exist?
That said, perhaps the AddColumn method already reuses the existing
table's memory when it creates a "new Table".

#include <memory>
#include <string>
#include <vector>

#include <arrow/api.h>

arrow::Status AddMovingAverage(std::shared_ptr<arrow::Table>& table,
                               const std::string& colNameIn, int n,
                               const std::string& colNameOut) {
  // GetColumnByName returns nullptr if the column does not exist
  auto vals = table->GetColumnByName(colNameIn);
  if (vals == nullptr) {
    return arrow::Status::KeyError("no column named ", colNameIn);
  }
  // calculate moving average vector
  std::vector<double> ma;
  ....
  // convert vector to arrow array
  std::shared_ptr<arrow::Array> ma_arr;
  arrow::DoubleBuilder dbl_builder;
  ARROW_RETURN_NOT_OK(dbl_builder.AppendValues(ma.begin(), ma.end()));
  ARROW_ASSIGN_OR_RAISE(ma_arr, dbl_builder.Finish());
  // LOG(INFO) << ma_arr->ToString() << std::endl;
  // add new column to table (need to convert to chunked array first)
  auto f0 = arrow::field(colNameOut, arrow::float64());
  auto ma_chunked_arr = std::make_shared<arrow::ChunkedArray>(ma_arr);
  // Can this be done more efficiently, without copying the original table
  // to a new memory location?
  ARROW_ASSIGN_OR_RAISE(auto new_table,
                        table->AddColumn(0, f0, ma_chunked_arr));
  // swap pointer to new table and clean up
  table.swap(new_table);
  new_table.reset();
  return arrow::Status::OK();
}
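
To make my guess about AddColumn concrete, here is a toy model in plain C++ (no Arrow involved) of what I imagine might be happening. ToyColumn, ToyTable and this free-standing AddColumn are hypothetical names made up for illustration, and I have not checked the sketch against the Arrow source. The idea is that a table holds only shared pointers to its columns, so building a "new" table copies the pointers while the column data stays where it is.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy stand-ins, NOT Arrow types: a column owns its data,
// and a table holds only shared pointers to columns.
struct ToyColumn {
  std::string name;
  std::vector<double> values;
};

struct ToyTable {
  std::vector<std::shared_ptr<const ToyColumn>> columns;
};

// Returns a new table with `col` inserted at position `i`.
// Only the vector of pointers is copied; the existing column
// data is shared between the old and the new table.
std::shared_ptr<ToyTable> AddColumn(const ToyTable& table, std::size_t i,
                                    std::shared_ptr<const ToyColumn> col) {
  assert(i <= table.columns.size());
  auto out = std::make_shared<ToyTable>(table);  // copies pointers only
  out->columns.insert(
      out->columns.begin() + static_cast<std::ptrdiff_t>(i), std::move(col));
  return out;
}
```

If arrow::Table behaves like this toy, then the AddColumn call in my snippet would already be the "zero-copy union" I am asking about, and dropping the old table would release only its own list of column pointers.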