Thanks, Weston. We are mostly interested in the total memory allocation/usage for an end-to-end "read - asofjoin - write" use case, so we will revert the table source node change (since we don't want to load all the data into memory).
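Concretely, the number we're after is something like this (a minimal sketch, assuming the plan's ExecContext is pointed at the process-wide default pool; `bytes_allocated()` and `max_memory()` are the standard `arrow::MemoryPool` accessors):

```
#include <iostream>

#include "arrow/memory_pool.h"

int main() {
  arrow::MemoryPool* pool = arrow::default_memory_pool();

  // ... build and run the "read -> asofjoin -> write" exec plan here,
  //     with its ExecContext pointing at `pool` ...

  // Everything allocated through the pool is reflected in these counters.
  std::cout << "currently allocated: " << pool->bytes_allocated() << " bytes\n";
  std::cout << "peak allocation:     " << pool->max_memory() << " bytes\n";
  return 0;
}
```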
On Mon, Jul 11, 2022 at 5:33 PM Weston Pace <weston.p...@gmail.com> wrote:
> I suppose it depends on your goal.
>
> My earlier feedback was that doing a true scan is often detrimental
> for benchmarking, since I/O time can often dominate the results. Also,
> to get the best scan results, you often spend a lot of time
> micromanaging the file format / compression / file layout / etc. That
> was why I had recommended going with a TableSourceNode if you were
> building a benchmark focused on understanding a single node.
>
> On the other hand, if your goal is understanding end-to-end query
> times, then a table source node is probably not what you would start
> with.
>
> One useful number, regardless of how you are inputting your data, is
> the "total size of all data". You wouldn't get that from a memory
> pool, though. You could get it by calling the utilities in
> src/arrow/util/byte_size.h on your table (see the byte_size.h sketch
> at the bottom of this thread). This might give you something to
> compare/contrast an individual node's allocations against.
>
> On Mon, Jul 11, 2022 at 2:04 PM Li Jin <ice.xell...@gmail.com> wrote:
> >
> > > TableSourceNode wouldn't need to allocate since it runs against
> > > memory that's already been allocated.
> >
> > Is the memory "that is already allocated" tracked in any allocators?
> > For an end-to-end benchmark of "scan - join - write", I think it
> > would make sense to include all Arrow memory allocations (if that
> > makes sense).
> >
> > On Mon, Jul 11, 2022 at 4:37 PM Weston Pace <weston.p...@gmail.com> wrote:
> > >
> > > > Is there anything else I'd need to change?
> > >
> > > Maybe try something like this:
> > >
> > > https://github.com/westonpace/arrow/commit/15ac0d051136c585cda63297e48f17557808d898
> > >
> > > > Beyond that, we should also expect to see some allocations from
> > > > TableSourceNode going through the logging memory pool, even if
> > > > AsOfJoinNode was using the default memory pool instead of the
> > > > Exec Plan's pool, but I am not seeing anything come through...
> > >
> > > TableSourceNode wouldn't need to allocate since it runs against
> > > memory that's already been allocated. It might split its input into
> > > smaller batches, but slicing tables / arrays is a zero-copy
> > > operation that does not require allocating new buffers (see the
> > > slicing sketch below).
> > >
> > > On Mon, Jul 11, 2022 at 12:46 PM Ivan Chau <ivan.c...@twosigma.com> wrote:
> > > >
> > > > Yeah, this behavior is certainly a bit strange then.
> > > >
> > > > The only alteration I am making is changing the way we create the
> > > > Execution Context in the benchmark file.
> > > >
> > > > Something like:
> > > >
> > > > ```
> > > > auto logging_pool = LoggingMemoryPool(default_memory_pool());
> > > > ExecContext ctx(&logging_pool, ...);
> > > > ```
> > > >
> > > > Is there anything else I'd need to change?
> > > >
> > > > Beyond that, we should also expect to see some allocations from
> > > > TableSourceNode going through the logging memory pool, even if
> > > > AsOfJoinNode was using the default memory pool instead of the
> > > > Exec Plan's pool, but I am not seeing anything come through...
> > > >
> > > > -----Original Message-----
> > > > From: Weston Pace <weston.p...@gmail.com>
> > > > Sent: Monday, July 11, 2022 2:47 PM
> > > > To: dev@arrow.apache.org
> > > > Subject: Re: cpp Memory Pool Clarification
> > > >
> > > > Are you changing the default memory pool to a LoggingMemoryPool?
> > > > Where are you doing this? For a benchmark, I think you would need
> > > > to change the implementation in the benchmark file itself (see the
> > > > wiring sketch below).
> > > >
> > > > Similarly, is AsofJoinNode using the default memory pool or the
> > > > memory pool of the exec plan? It should be exclusively using the
> > > > latter, but it's sometimes easy to overlook a use of the default
> > > > memory pool. It probably won't make too much of a difference at
> > > > the end of the day, as benchmarks normally configure an exec plan
> > > > to use the default memory pool, in which case the two pools are
> > > > the same.
> > > >
> > > > > My expectation is that we would see some pretty sizable calls
> > > > > to Allocate when we begin to read files or to create tables,
> > > > > but that is not evident.
> > > >
> > > > Yes, the materialization step of an asof join uses array builders,
> > > > and those will be allocating buffers from a memory pool (see the
> > > > builder sketch below).
> > > >
> > > > > 1) To my understanding, only large allocations will call
> > > > > Allocate. Are there allocations (for files, table objects)
> > > > > which, despite being of large size, do not call Allocate?
> > > >
> > > > No. There is no size limit for the allocator. Instead, when people
> > > > were talking about "large allocations" and "small allocations" in
> > > > the previous thread, it was more of a general concept.
> > > >
> > > > For example, if I create an array builder, add some items to it,
> > > > and then create an array, this will always use a memory pool for
> > > > the allocation. This is true even if I create an array with a
> > > > single element in it (in which case the allocation is often padded
> > > > for alignment purposes).
> > > >
> > > > On the other hand, schemas keep their fields in a std::vector,
> > > > which never uses the memory pool for allocation. This is true even
> > > > if I have 10,000 columns and the vector's memory is actually quite
> > > > large.
> > > >
> > > > However, in general, arrays tend to be quite large and schemas
> > > > tend to be quite small.
> > > >
> > > > > 2) How can maximum_peak_memory be nonzero if we have not seen
> > > > > any calls to Allocate/Reallocate/Free?
> > > >
> > > > I don't think that is possible.
> > > >
> > > > On Mon, Jul 11, 2022 at 10:44 AM Ivan Chau <ivan.m.c...@gmail.com> wrote:
> > > > >
> > > > > Hi all,
> > > > >
> > > > > I've been doing some testing with LoggingMemoryPool to benchmark
> > > > > our AsOfJoin implementation
> > > > > <https://github.com/apache/arrow/blob/master/cpp/src/arrow/compute/exec/asof_join_node.cc>.
> > > > > Our underlying memory pool for the LoggingMemoryPool is the
> > > > > default_memory_pool (this is process-wide).
> > > > >
> > > > > Curiously enough, I don't see any allocations, reallocations, or
> > > > > frees when we run our benchmarking code. I also see that the
> > > > > max_memory property of the memory pool (which is documented as
> > > > > the peak memory allocation) is nonzero (1.2e9 bytes).
> > > > >
> > > > > My expectation is that we would see some pretty sizable calls to
> > > > > Allocate when we begin to read files or to create tables, but
> > > > > that is not evident.
> > > > >
> > > > > 1) To my understanding, only large allocations will call
> > > > > Allocate. Are there allocations (for files, table objects)
> > > > > which, despite being of large size, do not call Allocate?
> > > > >
> > > > > 2) How can maximum_peak_memory be nonzero if we have not seen
> > > > > any calls to Allocate/Reallocate/Free?
> > > > >
> > > > > Thank you!
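For concreteness, the benchmark wiring discussed above in slightly fuller form (a minimal sketch, assuming the 2022-era `arrow::compute` headers; the one real constraint is lifetime, noted in the comment):

```
#include "arrow/compute/exec.h"
#include "arrow/memory_pool.h"

// Route the plan's allocations through a LoggingMemoryPool so every
// Allocate/Reallocate/Free issued via the ExecContext is printed.
// The pool must outlive the ExecContext (and the plan) that uses it.
void RunLoggedBenchmark() {
  arrow::LoggingMemoryPool logging_pool(arrow::default_memory_pool());
  arrow::compute::ExecContext ctx(&logging_pool);
  // ... build the ExecPlan from `ctx`, run the asof-join benchmark,
  //     then inspect logging_pool.max_memory() for the peak ...
}
```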
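The byte_size.h utilities Weston mentions can be applied like this (a sketch; `TotalBufferSize` is the helper in `arrow/util/byte_size.h`, and `table` stands in for whatever the benchmark has loaded):

```
#include <iostream>
#include <memory>

#include "arrow/table.h"
#include "arrow/util/byte_size.h"

// Reports the "total size of all data" for an in-memory table,
// independent of what any memory pool has tracked.
void ReportTableSize(const std::shared_ptr<arrow::Table>& table) {
  std::cout << "total buffer size: "
            << arrow::util::TotalBufferSize(*table) << " bytes\n";
}
```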
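The zero-copy point about TableSourceNode's batch splitting is easy to see directly (a sketch; `table` and `pool` are assumed from context):

```
#include <cassert>
#include <memory>

#include "arrow/memory_pool.h"
#include "arrow/table.h"

// Slicing a table is zero-copy: the slice shares the parent's buffers,
// so the pool's allocation counters do not move.
void SliceIsZeroCopy(const std::shared_ptr<arrow::Table>& table,
                     arrow::MemoryPool* pool) {
  const int64_t before = pool->bytes_allocated();
  auto first_half = table->Slice(0, table->num_rows() / 2);
  assert(pool->bytes_allocated() == before);  // no buffers were copied
  (void)first_half;
  (void)before;
}
```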
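And the builder behavior described above, where even a one-element array allocates from the pool (a sketch; passing a LoggingMemoryPool as `pool` here would make the Allocate calls visible):

```
#include <memory>

#include "arrow/array.h"
#include "arrow/builder.h"
#include "arrow/memory_pool.h"
#include "arrow/result.h"
#include "arrow/status.h"

// Builders always take their buffers from a memory pool, even for a
// single element (the allocation is padded for alignment). Contrast
// this with a Schema's std::vector of fields, which the pool never sees.
arrow::Result<std::shared_ptr<arrow::Array>> BuildTinyArray(
    arrow::MemoryPool* pool) {
  arrow::Int64Builder builder(pool);
  ARROW_RETURN_NOT_OK(builder.Append(42));
  return builder.Finish();
}
```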