Hi Colin, The reason your first two classes use more memory is because MallocMessageBuilder pre-allocates space, and your messages are probably much smaller than what it pre-allocates. You can pass values to MallocMessageBuilder's constructor to tell it to allocate less space.
That said, you should check out the OwnCapnp class in Sandstorm's code, which I think does exactly what you're looking for here: https://github.com/sandstorm-io/sandstorm/blob/4d86a8144cdb43120ea12845738d0fe4a6ffcda1/src/sandstorm/util.h#L501-L525 This class allocates exactly as much space as is needed, does only one copy, and guarantees that the resulting copy has no out-of-bounds pointers, which allows it to construct a Reader that skips bounds checking for better performance. (This is still safe to use with untrusted input, as all pointers are validated during the copy step.) I really should add this utility type to Cap'n Proto itself... As for your memory leaks, I don't think your code is leaking memory. It looks like it is handling ownership correctly and, as you said, valgrind claims no leaks. You say that memory usage stays high even after deleting the objects, but this is typical: many memory allocators do not return memory to the system, or only do so after some time has passed or under other specific conditions. Fragmentation can indeed make it harder to return memory to the system. So, it's pretty common that a process's memory usage tells you the maximum memory it ever allocated, rather than the amount currently allocated. If you absolutely must return memory to the system promptly, you may need to customize your memory allocation. If you allocate memory using mmap() instead of malloc(), then the corresponding munmap() will definitely return the memory. But, of course, mmap() can only allocate in multiples of the system page size, so then it's up to you how to pack smaller objects into that space. Cap'n Proto gives you a lot of control over memory allocation by writing custom subclasses of MessageReader and MessageBuilder, or by passing in your own scratch space, which might make the job easier. -Kenton On Wed, Apr 17, 2019 at 2:23 PM <[email protected]> wrote: > Hi in my use case I would like to read messages from a stream, selectively > store certain structs from some messages in a dictionary for later use. > Unfortunately I cannot get it to work without leaking memory. The total > number of messages in the dictionary is bounded, as new messages with the > same key would replace the old message. (Actually valgrind doesn't think it > is leaking, but I can see my memory usage is increasing indefinitely). > > Test code > > messages are stored in a vector `data` > > > struct ProcessStream { > using CapnpMsg = capnp_utils::CapnpMessage<Bar>; > std::vector<std::shared_ptr<CapnpMsg>> data; > > > bool process_message( > kj::BufferedInputStreamWrapper& buffered_stream, kj::Array<capnp:: > word>& scratch > ) { > capnp::InputStreamMessageReader reader(buffered_stream, capnp:: > ReaderOptions(), scratch.asPtr()); > > > const auto ts_msg = reader.getRoot<Foo>(); > auto mir_msg = ts_msg.getBar(); > auto capnp_msg = std::make_shared<CapnpMsg>(mir_msg); > data.emplace_back(capnp_msg); > if (data.size() > 1000000) { > return false; > } > > > return true; > } > > > void start() { > // 1 MB of scratch. > kj::Array<capnp::word> scratch = kj::heapArray<capnp::word>(1024 * > 1024 / sizeof(capnp::word)); > kj::FdInputStream fd_stream(fileno(stdin)); > kj::BufferedInputStreamWrapper buffered_stream(fd_stream); > > > while (buffered_stream.tryGetReadBuffer() != nullptr) { > if (!process_message(buffered_stream, scratch)) { > break; > } > } > } > }; > > > > void foo() { > ProcessStream stream; > stream.start(); > std::cout << "A" << std::endl; > usleep(1000 * 1000 * 5); > stream.data.clear(); > } > > > int main(int argc, char* argv[]) { > > > foo(); > std::cout << "B" << std::endl; > usleep(1000 * 1000 * 10); > } > > > The following are the 3 ways to use a wrapper class to store the messages > > > > > > namespace capnp_utils { > > /// 1 & 2 takes 10x more memory. > > template<typename R> > class CapnpMessage2 { > // TODO: Should be private, but leave it public for debugging. > public: > typename R::Reader msg_reader; > std::unique_ptr<capnp::MallocMessageBuilder> msg_builder; > > public: > CapnpMessage2(const typename R::Reader& reader) { > msg_builder = std::make_unique<capnp::MallocMessageBuilder > >(); > msg_builder->setRoot(reader); > msg_reader = msg_builder->getRoot<R>().asReader(); > } > > > const typename R::Reader& get() const { > return msg_reader; > } > }; > > > template<typename R> > class CapnpMessage1 { > > public: > typename R::Reader struct_reader; > std::unique_ptr<capnp::MallocMessageBuilder> msg_builder; > public: > > CapnpMessage1(const typename R::Reader& reader) { > capnp::MallocMessageBuilder tmp; > tmp.setRoot(reader); > const auto raw = capnp::messageToFlatArray(tmp); > > > msg_builder = std::make_unique<capnp::MallocMessageBuilder > >(); > capnp::initMessageBuilderFromFlatArrayCopy(raw, * > msg_builder); > struct_reader = msg_builder->getRoot<R>().asReader(); > } > > > const typename R::Reader& get() const { > return struct_reader; > } > }; > > > > > template<typename R> > class CapnpMessage { > > public: > typename R::Reader struct_reader; > // `struct_reader` is only valid when `msg_reader` is alive. > std::unique_ptr<capnp::FlatArrayMessageReader> msg_reader; > kj::Array<capnp::word> raw; > public: > > > CapnpMessage(const typename R::Reader& reader) { > capnp::MallocMessageBuilder msg_builder; > msg_builder.setRoot(reader); > raw = capnp::messageToFlatArray(msg_builder); > msg_reader = std::make_unique<capnp:: > FlatArrayMessageReader>(raw); > struct_reader = msg_reader->getRoot<R>(); > } > > const typename R::Reader& get() const { > return struct_reader; > } > }; > > } > > `CapnpMessage2` copies the message once, while `CapnpMessage1` and > `CapnpMessage` both copy the message twice. > > In terms of memory usage: > A B > CapnpMessage2 8152MB 8136MB > CapnpMessage1 8166MB 8151MB > CapnpMessage 640MB 625MB > > > It looks that `CapnpMessage1` & `CapnpMessage2` use far more memory than > `CapnpMessage`. Is it because the builder has reserved too much space even > though I only need it to be as big as the reader? > The important part is that in all 3 cases, memory are still used at time B > although `data` is cleared. (Even if I reset the `unique_ptr` inside the > wrapper class). > > Is it somehow related to memory fragmentation? > > What is the best approach in this case? I would very much appreciate any > help to reduce the steady memory increase. > > -- > You received this message because you are subscribed to the Google Groups > "Cap'n Proto" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to [email protected]. > Visit this group at https://groups.google.com/group/capnproto. > -- You received this message because you are subscribed to the Google Groups "Cap'n Proto" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. Visit this group at https://groups.google.com/group/capnproto.
