[
https://issues.apache.org/jira/browse/MINIFICPP-929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Marton Szasz resolved MINIFICPP-929.
------------------------------------
Resolution: Won't Fix
abandoned
> Create memory map interface to flow files in ProcessSession/ContentRepository
> -----------------------------------------------------------------------------
>
> Key: MINIFICPP-929
> URL: https://issues.apache.org/jira/browse/MINIFICPP-929
> Project: Apache NiFi MiNiFi C++
> Issue Type: Improvement
> Reporter: Andrew Christianson
> Assignee: Andrew Christianson
> Priority: Minor
> Time Spent: 1h 10m
> Remaining Estimate: 0h
>
> Currently, MiNiFi - C++ only supports stream-oriented I/O to FlowFile
> payloads. This can limit performance in cases where in-place access to the
> payload is desirable. In cases where data can be accessed randomly and
> in-place, a significant speedup can be realized by mapping the payload into
> the process's virtual address space. This is natively supported at the kernel
> level in Linux and macOS via mmap() on files, and in Windows via file mapping
> objects (CreateFileMapping()/MapViewOfFile()). Other
> repositories, such as the VolatileRepository, already store the entire
> payload in memory, so it is natural to pass through this memory block as if
> it were a memory-mapped file. While the DatabaseContentRepository does not
> appear to natively support a memory map interface, accesses via an emulated
> memory-map interface should be possible with no performance degradation with
> respect to a full read via the streaming interface.
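For illustration, the file-backed case described above could look roughly like the following on POSIX systems (Linux, macOS). This is a minimal sketch, not MiNiFi - C++ code: MappedFile is a hypothetical helper name, the mapping is read-only, and a Windows build would need CreateFileMapping()/MapViewOfFile() instead.

```cpp
// Sketch only: read-only POSIX mapping of a file on disk.
// MappedFile is a hypothetical helper, not an existing MiNiFi - C++ class.
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

class MappedFile {
 public:
  explicit MappedFile(const std::string& path) {
    int fd = ::open(path.c_str(), O_RDONLY);
    if (fd < 0) throw std::runtime_error("open failed: " + path);

    struct stat st {};
    // mmap() rejects zero-length mappings, so empty files are refused here.
    if (::fstat(fd, &st) != 0 || st.st_size == 0) {
      ::close(fd);
      throw std::runtime_error("fstat failed or file is empty: " + path);
    }
    size_ = static_cast<std::size_t>(st.st_size);

    void* p = ::mmap(nullptr, size_, PROT_READ, MAP_PRIVATE, fd, 0);
    ::close(fd);  // the mapping stays valid after the descriptor is closed
    if (p == MAP_FAILED) throw std::runtime_error("mmap failed: " + path);
    data_ = static_cast<const unsigned char*>(p);
  }

  ~MappedFile() { ::munmap(const_cast<unsigned char*>(data_), size_); }

  MappedFile(const MappedFile&) = delete;
  MappedFile& operator=(const MappedFile&) = delete;

  const unsigned char* data() const { return data_; }
  std::size_t size() const { return size_; }

  // Random access is plain pointer arithmetic; no seek()/read() per access.
  unsigned char at(std::size_t offset) const {
    assert(offset < size_);
    return data_[offset];
  }

 private:
  const unsigned char* data_ = nullptr;
  std::size_t size_ = 0;
};
```

Closing the descriptor right after mmap() keeps the helper to a single owned resource (the mapping), which the destructor releases with munmap().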
> Cases where in-place, random access is beneficial include, but are not
> limited to:
> * in-place parsing of JSON (e.g. RapidJSON supports parsing in-place, at
> least for strings).
> * access of payload via protocol buffers
> * random access of large files on disk, where it would otherwise require
> many seek() and read() syscalls
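To make the last bullet concrete, here is a small sketch contrasting the two access patterns for fixed-size records. The Record type and both function names are made up for this example. The "mapped" variant takes a pointer and a size, as a memory-mapped payload would provide, and the stream variant uses std::istream as a stand-in for the stream-oriented interface.

```cpp
// Sketch only: per-record access via a stream versus via a mapped region.
// Record, readRecordStream and readRecordMapped are hypothetical names.
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical fixed-size record: one 32-bit value in host byte order.
using Record = std::uint32_t;

// Stream-oriented access: one seek and one read per record.
inline bool readRecordStream(std::istream& in, std::size_t index, Record& out) {
  in.clear();  // reset any error state left by a previous failed access
  in.seekg(static_cast<std::streamoff>(index * sizeof(Record)), std::ios::beg);
  in.read(reinterpret_cast<char*>(&out), sizeof(Record));
  return in.gcount() == static_cast<std::streamsize>(sizeof(Record));
}

// Mapped access: a bounds check plus a memcpy from the mapped region.
inline bool readRecordMapped(const unsigned char* base, std::size_t size,
                             std::size_t index, Record& out) {
  if (index >= size / sizeof(Record)) return false;
  std::memcpy(&out, base + index * sizeof(Record), sizeof(Record));
  return true;
}
```

With a file-backed stream, each readRecordStream() call can turn into seek and read syscalls, while readRecordMapped() only touches memory that the kernel pages in on demand.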
> The interface should be accessible by processors via a mmap() call on
> ProcessSession (adjacent to read() and write()). A MemoryMapCallback should
> be provided, which is called back via a process() call where the argument is
> an instance of BaseMemoryMap. The BaseMemoryMap is extended for each type of
> repository that MiNiFi - C++ supports, including: FileSystemRepository,
> VolatileRepository, and DatabaseContentRepository.
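One possible shape for the proposed interface is sketched below. Only the names BaseMemoryMap, MemoryMapCallback and process() come from the description above. Every signature, the getData()/getSize() accessors, VolatileMemoryMap, UppercaseCallback and the free function mmapPayload() (standing in for the proposed ProcessSession mmap() call) are assumptions made for illustration, not the actual MiNiFi - C++ API.

```cpp
// Sketch only: signatures and helper names are assumptions, not MiNiFi - C++ API.
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Abstract view of a payload mapped into memory.
class BaseMemoryMap {
 public:
  virtual ~BaseMemoryMap() = default;
  virtual void* getData() = 0;
  virtual std::size_t getSize() const = 0;
};

// Called back by the session with the mapped payload.
class MemoryMapCallback {
 public:
  virtual ~MemoryMapCallback() = default;
  virtual std::int64_t process(const std::shared_ptr<BaseMemoryMap>& map) = 0;
};

// Volatile-style repository: the payload already lives in memory,
// so the "map" simply passes the existing block through.
class VolatileMemoryMap : public BaseMemoryMap {
 public:
  explicit VolatileMemoryMap(std::vector<std::uint8_t>& payload)
      : payload_(payload) {}
  void* getData() override { return payload_.data(); }
  std::size_t getSize() const override { return payload_.size(); }

 private:
  std::vector<std::uint8_t>& payload_;
};

// Hypothetical stand-in for the proposed ProcessSession mmap() call.
inline std::int64_t mmapPayload(std::vector<std::uint8_t>& payload,
                                MemoryMapCallback& callback) {
  auto map = std::make_shared<VolatileMemoryMap>(payload);
  return callback.process(map);
}

// Example processor-side callback: upper-cases the payload in place
// and returns the number of bytes it visited.
class UppercaseCallback : public MemoryMapCallback {
 public:
  std::int64_t process(const std::shared_ptr<BaseMemoryMap>& map) override {
    auto* bytes = static_cast<unsigned char*>(map->getData());
    for (std::size_t i = 0; i < map->getSize(); ++i) {
      bytes[i] = static_cast<unsigned char>(std::toupper(bytes[i]));
    }
    return static_cast<std::int64_t>(map->getSize());
  }
};
```

A file-backed or database-backed repository would provide its own BaseMemoryMap subclass behind the same callback, so a processor written against MemoryMapCallback would not need to know which repository holds the payload.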
> As part of the change, in addition to extensive unit test coverage,
> benchmarks should be written such that the performance impact can be
> empirically measured and evaluated.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)