Hi Wes, I can absolutely do that sure. Let me come up with some examples and write back. You might be interested to read this, which is still a WIP, but, most C++ developers follow these guidelines (even before there was a core guidelines)
https://github.com/isocpp/CppCoreGuidelines/blob/master/CppCoreGuidelines.md I guess everything I am saying is just a guideline: https://github.com/isocpp/CppCoreGuidelines/blob/master/CppCoreGuidelines.md#Rsl-lib On Wed, Jul 27, 2016 at 11:25 PM Wes McKinney <[email protected]> wrote: > hi Ryan, > > comments inline > > On Wed, Jul 27, 2016 at 1:36 PM, Ryan Lewis <[email protected]> wrote: > > Hi, > > > > I am interested in contributing to parquet-cpp. I would like to introduce > > myself to the developer community, and as per suggestion from a private > > conversation with Wes McKinney, briefly discuss where I think I can add > > some value. What I have noticed is that the library is not leveraging the > > C++ standard library, and instead is recreating standard behavior. Two > > concrete examples are the InputStream class and its children, and the > > Buffer objects. These are perhaps modeled by the input iterator concept: > > http://en.cppreference.com/w/cpp/concept/InputIterator (or ideally some > > refinement of it) and the allocator concept > > http://en.cppreference.com/w/cpp/concept/Allocator. Components of the > > standard library which implements these concepts are listed in the > previous > > links. > > > > Can you describe or give some concrete examples of how places where > the current parquet-cpp MemoryAllocator concept could be altered to > use a stateful allocator implementation conforming to std::allocator / > std::allocator_traits? It is not clear to me that we are recreating > standard behavior; we have a working approach to user-defined memory > allocation and tracking that is distinct from the STL approach, even > if there is semantic functional overlap. We have discussed creating a > conforming STL allocator containing a pointer to a particular > parquet::MemoryAllocator in its state, and I think this is a good idea > independently (we are using the default allocator in a number of > places, which is bad because these allocations may not be known to the > user's custom allocator). > > The intent of these classes is to provide users a way to define custom > implementations / subclasses of concepts that are not known to the > library at compile time (and instead using dynamic dispatch / virtual > functions to invoke the correct symbols). The risk I see is that if > you consistently use the STL template-based approach to defining > custom allocators, then this will cascade throughout much of the > library (since any current class which utilizes > parquet::MemoryAllocator will now need to know the type of the user > allocator at compile time). If parquet-cpp were a self-contained > system (rather than intended as a pluggable component), it would be a > different story. Maybe I'm thinking about it wrong, this is not my > area of expertise, having only browsed the STL documentation on custom > allocators. > > Here is an example of a more advanced memory management subsystem in > Apache Kudu (see the abstract interface in BufferAllocator): > > > https://github.com/apache/incubator-kudu/blob/master/src/kudu/util/memory/memory.h > > See also the STL-compliant allocator that hooks into Kudu's memory arenas > here: > > > https://github.com/apache/incubator-kudu/blob/master/src/kudu/util/memory/arena.h > Usage: > https://github.com/apache/incubator-kudu/blob/master/src/kudu/tablet/tablet_metrics.cc#L275 > > (note that Kudu is a self-contained system so not apples-to-apples) > > The parquet::Buffer seems slightly orthogonal to memory allocators -- > this was intended to make the semantics of memory ownership and > deallocation more flexible where needed. > > > There is a decent amount of code in parquet-cpp and so its a bit > > challenging for me to provide a constructive thought on how to perform > > refactors (which I would happily do), to have the library use the > standard > > library more, however, I also don't want to do something if it is not a > > welcome change. Therefore feedback from the developer community would be > > beneficial. > > I would like to understand the argument a bit better and see some code > specifics. Would be great to get other C++ developers in the ASF > community to weigh in once we've identified the central pros / cons. > > Thanks > Wes > > > > > Best, > > -rhl >
