[jira] [Updated] (FLINK-12886) Support container memory segment

Flink Jira Bot (Jira) Mon, 07 Jun 2021 04:00:12 -0700


     [ 
https://issues.apache.org/jira/browse/FLINK-12886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Flink Jira Bot updated FLINK-12886:
-----------------------------------
    Labels: auto-unassigned pull-request-available stale-major  (was: 
auto-unassigned pull-request-available)

I am the [Flink Jira Bot|https://github.com/apache/flink-jira-bot/] and I help 
the community manage its development. I see this issues has been marked as 
Major but is unassigned and neither itself nor its Sub-Tasks have been updated 
for 30 days. I have gone ahead and added a "stale-major" to the issue". If this 
ticket is a Major, please either assign yourself or give an update. Afterwards, 
please remove the label or in 7 days the issue will be deprioritized.


> Support container memory segment
> --------------------------------
>
>                 Key: FLINK-12886
>                 URL: https://issues.apache.org/jira/browse/FLINK-12886
>             Project: Flink
>          Issue Type: New Feature
>          Components: Table SQL / Runtime
>            Reporter: Liya Fan
>            Priority: Major
>              Labels: auto-unassigned, pull-request-available, stale-major
>         Attachments: image-2019-06-18-17-59-42-136.png
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> We observe that in many scenarios, the operations/algorithms are based on an 
> array of MemorySegment. These memory segments form a large, combined, and 
> continuous memory space.
> For example, suppose we have an array of n memory segments. Memory addresses 
> from 0 to segment_size - 1 are served by the first memory segment; memory 
> addresses from segment_size to 2 * segment_size - 1 are served by the second 
> memory segment, and so on.
> Specific algorithms decide the actual MemorySegment to serve the operation 
> requests. For some rare cases, two or more memory segments serve the 
> requests. There are many operations based on such a paradigm, for example, 
> {{BinaryString#matchAt}}, {{SegmentsUtil#copyToBytes}}, 
> {{LongHashPartition#MatchIterator#get}}, etc.
> The problem is that, for memory segment array based operations, large amounts 
> of code is devoted to
> 1. Computing the memory segment index & offset within the memory segment.
>  2. Processing boundary cases. For example, to write an integer, there are 
> only 2 bytes left in the first memory segment, and the remaining 2 bytes must 
> be written to the next memory segment.
>  3. Differentiate processing for short/long data. For example, when copying 
> memory data to a byte array. Different methods are implemented for cases when 
> 1) the data fits in a single segment; 2) the data spans multiple segments.
> Therefore, there are much duplicated code to achieve above purposes. What is 
> worse, this paradigm significantly increases the amount of code, making the 
> code more difficult to read and maintain. Furthermore, it easily gives rise 
> to bugs which difficult to find and debug.
> To address these problems, we propose a new type of memory segment: 
> {{ContainerMemorySegment}}. It is based on an array of underlying memory 
> segments with the same size. It extends from the {{MemorySegment}} base 
> class, so it provides all the functionalities provided by {{MemorySegment}}. 
> In addition, it hides all the details for dealing with specific memory 
> segments, and acts as if it were a big continuous memory region.
> A prototype implementation is given below:
>  !image-2019-06-18-17-59-42-136.png|thumbnail! 
> With this new type of memory segment, many operations/algorithms can be 
> greatly simplified, without affecting performance. This is because,
> 1. Many checks, boundary processing are already there. We just move them to 
> the new class.
>  2. We optimize the implementation of the new class, so the special 
> optimizations (e.g. optimizations for short data) are still preserved.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Updated] (FLINK-12886) Support container memory segment

Reply via email to