Thanks Weston.

We are mostly interested in the total memory allocation/usage for an
end-to-end "read-asofjoin-write" use case and will revert the table source
node (since we don't want to load all data into memory).



On Mon, Jul 11, 2022 at 5:33 PM Weston Pace <weston.p...@gmail.com> wrote:

> I suppose it depends on your goal.
>
> My earlier feedback was that doing a true scan is often detrimental
> for benchmarking since I/O time can often dominate the results.  Also,
> to get the best scan results, you often spend a lot of time
> micromanaging the file format / compression / file layout / etc.  That
> was why I had recommended going with a TableSourceNode if you were
> building a benchmark focused on understanding a single node.
>
> On the other hand, if your goal is understanding end-to-end query
> times, then a table source node is probably not what you would start
> with.
>
> One useful number, regardless of how you are inputting your data, is
> the "total size of all data".  You wouldn't get that from a memory
> pool though.  You could get that by calling the utilities in
> src/arrow/util/byte_size.h on your table.  This might give you
> something to compare/contrast allocation of an individual node with.
>
> On Mon, Jul 11, 2022 at 2:04 PM Li Jin <ice.xell...@gmail.com> wrote:
> >
> > > TableSourceNode wouldn't need to allocate since it runs against memory
> > that's already been allocated.
> > Is the memory "that is already allocated" tracked in any allocators? For
> > an end-to-end benchmark of "scan - join - write" I think it would make
> > sense to include all arrow memory allocation (if that makes sense)
> >
> > On Mon, Jul 11, 2022 at 4:37 PM Weston Pace <weston.p...@gmail.com>
> > wrote:
> >
> > > > Is there anything else I'd need to change?
> > >
> > > Maybe try something like this:
> > >
> > >
> > > https://github.com/westonpace/arrow/commit/15ac0d051136c585cda63297e48f17557808d898
> > >
> > > > Beyond that, we should also expect to see some allocations from
> > > > TableSourceNode going through the logging memory pool, even if
> > > > AsOfJoinNode was using the default memory pool instead of the Exec
> > > > Plan's pool, but I am not seeing anything come through...
> > >
> > > TableSourceNode wouldn't need to allocate since it runs against memory
> > > that's already been allocated.  It might split input into smaller
> > > batches but slicing tables / arrays is a zero-copy operation that does
> > > not require allocating new buffers.
> > >
> > > On Mon, Jul 11, 2022 at 12:46 PM Ivan Chau <ivan.c...@twosigma.com>
> > > wrote:
> > > >
> > > > Yeah this behavior is certainly a bit strange then.
> > > >
> > > > The only alteration I am making is changing the way we create the
> > > > Execution Context in the benchmark file.
> > > >
> > > > Something like:
> > > >
> > > > ```
> > > > auto logging_pool = LoggingMemoryPool(default_memory_pool());
> > > > ExecContext ctx(&logging_pool, ...);
> > > > ```
> > > >
> > > > Is there anything else I'd need to change?
> > > >
> > > > Beyond that, we should also expect to see some allocations from
> > > > TableSourceNode going through the logging memory pool, even if
> > > > AsOfJoinNode was using the default memory pool instead of the Exec
> > > > Plan's pool, but I am not seeing anything come through...
> > > >
> > > > -----Original Message-----
> > > > From: Weston Pace <weston.p...@gmail.com>
> > > > Sent: Monday, July 11, 2022 2:47 PM
> > > > To: dev@arrow.apache.org
> > > > Subject: Re: cpp Memory Pool Clarification
> > > >
> > > > Are you changing the default memory pool to a LoggingMemoryPool?
> > > > Where are you doing this?  For a benchmark I think you would need to
> > > > change the implementation in the benchmark file itself.
> > > >
> > > > Similarly, is AsofJoinNode using the default memory pool or the
> > > > memory pool of the exec plan?  It should be exclusively using the
> > > > latter but it's easy sometimes to overlook using the default memory
> > > > pool.  It probably won't make too much of a difference at the end of
> > > > the day as benchmarks normally configure an exec plan to use the
> > > > default memory pool and so the two pools would be the same.
> > > >
> > > > > My expectation is that we would see some pretty sizable calls to
> > > > > Allocate when we begin to read files or to create tables, but that
> > > > > is not evident.
> > > >
> > > > Yes, the materialization step of an asof join uses array builders
> > > > and those will be allocating buffers from a memory pool.
> > > >
> > > > > 1) To my understanding, only large allocations will call Allocate.
> > > > > Are there allocations (for files, table objects), which despite
> > > > > being of large size, do not call Allocate?
> > > >
> > > > No.  There is no size limit for the allocator.  Instead, when people
> > > > were talking about "large allocations" and "small allocations" in the
> > > > previous thread it was more of a general concept.
> > > >
> > > > For example, if I create an array builder, add some items to it, and
> > > > then create an array then this will always use a memory pool for the
> > > > allocation.  This will be true even if I create an array with a
> > > > single element in it (in which case the allocation is often padded
> > > > for alignment purposes).
> > > >
> > > > On the other hand, schemas keep their fields in a std::vector which
> > > > never uses the memory pool for allocation.  This is true even if I
> > > > have 10,000 columns and the vector's memory is actually quite large.
> > > >
> > > > However, in general, arrays tend to be quite large and schemas tend
> > > > to be quite small.
> > > >
> > > > > 2) How can maximum_peak_memory be nonzero if we have not seen any
> > > > > calls to Allocate/Reallocate/Free?
> > > >
> > > > I don't think that is possible.
> > > >
> > > > On Mon, Jul 11, 2022 at 10:44 AM Ivan Chau <ivan.m.c...@gmail.com>
> > > > wrote:
> > > > >
> > > > > Hi all,
> > > > >
> > > > > I've been doing some testing with LoggingMemoryPool to benchmark
> > > > > our AsOfJoin implementation
> > > > > <https://github.com/apache/arrow/blob/master/cpp/src/arrow/compute/exec/asof_join_node.cc>.
> > > > > Our underlying memory pool for the LoggingMemoryPool is the
> > > > > default_memory_pool (this is process-wide).
> > > > >
> > > > > Curiously enough, I don't see any allocations, reallocations, or
> > > > > frees when we run our benchmarking code.  I also see that the
> > > > > max_memory property of the memory pool (which is documented as the
> > > > > peak memory allocation) is nonzero (1.2e9 bytes).
> > > > >
> > > > > My expectation is that we would see some pretty sizable calls to
> > > > > Allocate when we begin to read files or to create tables, but that
> > > > > is not evident.
> > > > >
> > > > > 1) To my understanding, only large allocations will call Allocate.
> > > > > Are there allocations (for files, table objects), which despite
> > > > > being of large size, do not call Allocate?
> > > > >
> > > > > 2) How can maximum_peak_memory be nonzero if we have not seen any
> > > > > calls to Allocate/Reallocate/Free?
> > > > >
> > > > > Thank you!
> > >
>
