Thanks a lot for your reply. I will work around the lack of a constant array for now, and hope to use one in the future.
Song

> On May 7, 2022, at 2:30 AM, Weston Pace <weston.p...@gmail.com> wrote:
>
> Hi Song,
>
> Wes proposed a couple of different array types a few months ago in
> [1]. These were documented in [2]. In this proposal a constant array
> type was suggested in addition to a run-length encoded array type.
> During the discussion it was suggested that a constant array might
> just be a special case of a run-length encoded array. So there has
> been some discussion about adding support for this. However, these
> ideas have not been implemented yet and I'm not aware of any PRs, so it
> is difficult to know if/when something may happen.
>
> At the moment you might be able to use
> arrow::compute::ExecBatch, which is what we use in the streaming
> execution engine, to work around this problem. An ExecBatch is a vector of
> datums, so each column can be either a scalar or an array. The
> batch itself has a length, so if a batch with length 50 has a scalar
> column then that implies a constant array of 50 items. However, this
> does complicate the logic (constantly needing to check whether a
> column is a scalar or an array) and I do hope the RLE array is added,
> as it can simplify a lot of this.
>
> -Weston
>
> [1] https://lists.apache.org/thread/49qzofswg1r5z7zh39pjvd1m2ggz2kdq
> [2] https://docs.google.com/document/d/12aZi8Inez9L_JCtZ6gi2XDbQpCsHICNy9_EUxj4ILeE/edit#heading=h.j2x776n0ymmp
>
> On Thu, May 5, 2022 at 4:28 PM Dongxiao Song <songdongx...@hashdata.cn> wrote:
>>
>> Hello,
>>
>> I'm using Arrow C++ as the storage and computing structure of my own
>> project, which is a database based on PostgreSQL.
>>
>> But when computing with a batch containing a constant-value column, the
>> constant value has to be expanded to an array to be stored in the batch,
>> which is a waste of time and memory.
>>
>> Arrow::Scalar can be used as a parameter for Arrow functions, but cannot
>> represent a column in a batch. So if we want to compute on a batch
>> containing a constant-value column, the expansion of the value is
>> inevitable.
>>
>> This occurs mainly before batch serialization, and in functions like
>> FilterBatch.
>>
>> A constant-type array may solve this problem. It looks like an Arrow array,
>> but only stores a single constant value and the number of rows. In
>> functions like Arrow::Sum, the result can even be computed by
>> multiplication.
>>
>> Another solution is allowing a batch to contain an Arrow::Scalar.
>>
>> All this is just a suggestion from an Arrow user. I'm not sure whether
>> it is helpful for the Arrow project.
>>
>> Thanks,
>> Song
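
For readers of the archive, the scalar-or-array column idea discussed in this thread can be sketched roughly as below. This is only an illustrative model using the C++17 standard library, not Arrow's API: Column, Batch, SumColumn and Materialize are hypothetical names, and int64_t values are assumed for simplicity. It shows the batch carrying the length, a scalar column standing in for a constant array, the sum-by-multiplication shortcut, and the scalar-or-array check that every consumer has to repeat.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <variant>
#include <vector>

// Hypothetical stand-ins, not Arrow types: a column is either one value
// (a "scalar") or a fully materialized vector of values (an "array").
using Column = std::variant<int64_t, std::vector<int64_t>>;

// Hypothetical batch: the length lives on the batch, so a scalar column
// implies `length` copies of the same value without storing them.
struct Batch {
  std::vector<Column> columns;
  int64_t length;
};

// Sum one column. The scalar case is the multiplication shortcut from the
// original message; the array case is an ordinary element-wise sum.
int64_t SumColumn(const Batch& batch, std::size_t i) {
  const Column& col = batch.columns.at(i);
  if (const auto* scalar = std::get_if<int64_t>(&col)) {
    return *scalar * batch.length;
  }
  const auto& values = std::get<std::vector<int64_t>>(col);
  // An array column must have exactly as many values as the batch has rows.
  assert(static_cast<int64_t>(values.size()) == batch.length);
  return std::accumulate(values.begin(), values.end(), int64_t{0});
}

// Expand a column to a full array only when a consumer requires it,
// for example just before serialization.
std::vector<int64_t> Materialize(const Batch& batch, std::size_t i) {
  const Column& col = batch.columns.at(i);
  if (const auto* scalar = std::get_if<int64_t>(&col)) {
    return std::vector<int64_t>(static_cast<std::size_t>(batch.length),
                                *scalar);
  }
  return std::get<std::vector<int64_t>>(col);
}
```

With a batch of length 50 whose column 0 is the scalar 2, SumColumn(batch, 0) returns 100 without allocating 50 values. The cost is the branch in every function that touches a column, which is the complication Weston mentions.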