Hello, I'm using Arrow C++ as the storage and compute layer of my own project, a database based on PostgreSQL.
When computing on a batch that contains a constant-value column, the constant has to be expanded into a full array before it can be stored in the batch, which wastes both time and memory. An arrow::Scalar can be passed as an argument to Arrow compute functions, but it cannot represent a column in a batch. So whenever we want to compute on a batch that contains a constant-value column, expanding the value is unavoidable. This mainly happens before batch serialization and in functions like FilterBatch.

A constant-type array might solve this problem. It would look like an ordinary Arrow array, but store only a single constant value and the number of rows. For functions like arrow::compute::Sum, the result could even be computed by a single multiplication (value times row count). Another solution would be to allow a batch to contain an arrow::Scalar directly.

This is just a suggestion from an Arrow user. I'm not sure whether it is helpful for the Arrow project.

Thanks,
Song