[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15573426#comment-15573426
 ] 

bright chen commented on APEXMALHAR-2190:
-----------------------------------------

Added new features:
  - Added ByteStreamMonitor to monitor all byte streams. Its functions include 
getTotalSize(), getTotalCapacity(), listStreamSize(), and listStreamCapacity()
  - Added the interface BlocksAdjustStrategy and its implementation 
DefaultBlocksAdjustStrategy to adjust (release) extra memory
  - Added dataSizeUpToWindow() and dataSizeOfWindow() to WindowableBlocksStream 
to support querying the data size of windows from other threads
  - Modified Bucket.DefaultBucket.freeMemory(long) to calculate the size of the 
data that can be freed and send a reset request to the main thread
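
The monitor described above could look roughly like the following. This is a 
minimal sketch, not the code from the patch: only the four method names come 
from this comment. The SizedStream interface, the register() method, the 
return types, and the class name ByteStreamMonitorSketch are assumptions made 
for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical stand-in for the stream type the monitor tracks.
interface SizedStream {
  long size();      // bytes currently holding data
  long capacity();  // bytes currently allocated
}

// Sketch of a monitor that aggregates size and capacity over registered streams.
class ByteStreamMonitorSketch {
  // CopyOnWriteArrayList lets other threads iterate while streams are registered.
  private final List<SizedStream> streams = new CopyOnWriteArrayList<>();

  // Hypothetical registration hook; the comment does not say how streams are added.
  void register(SizedStream stream) {
    streams.add(stream);
  }

  long getTotalSize() {
    long total = 0;
    for (SizedStream s : streams) {
      total += s.size();
    }
    return total;
  }

  long getTotalCapacity() {
    long total = 0;
    for (SizedStream s : streams) {
      total += s.capacity();
    }
    return total;
  }

  // Per-stream sizes, in registration order.
  List<Long> listStreamSize() {
    List<Long> sizes = new ArrayList<>();
    for (SizedStream s : streams) {
      sizes.add(s.size());
    }
    return sizes;
  }

  // Per-stream capacities, in registration order.
  List<Long> listStreamCapacity() {
    List<Long> capacities = new ArrayList<>();
    for (SizedStream s : streams) {
      capacities.add(s.capacity());
    }
    return capacities;
  }
}
```

The gap between getTotalCapacity() and getTotalSize() is the kind of figure a 
strategy such as BlocksAdjustStrategy could use to decide how much extra 
memory to release.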

> Use reusable buffer to serial spillable data structure
> ------------------------------------------------------
>
>                 Key: APEXMALHAR-2190
>                 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2190
>             Project: Apache Apex Malhar
>          Issue Type: Task
>            Reporter: bright chen
>            Assignee: bright chen
>   Original Estimate: 240h
>  Remaining Estimate: 240h
>
> Spillable data structures create a lot of temporary memory to serialize data 
> and perform a lot of memory copies (see SliceUtils.concatenate(byte[], byte[])), 
> which uses up memory very quickly. See APEXMALHAR-2182.
> Use shared memory to avoid allocating temporary memory and copying memory.
> Some basic ideas:
> - The SerToLVBuffer interface provides a method serTo(T object, LengthValueBuffer 
> buffer): instead of allocating memory and then returning the serialized data, 
> this method lets the caller pass in the buffer, so different objects, or an 
> object with embedded objects, can share the same LengthValueBuffer
> - LengthValueBuffer: a buffer that manages memory as length and value (the 
> generic format of serialized data). It provides a length placeholder 
> mechanism to avoid temporary memory and data copies when the length is only 
> known after the data has been serialized
> - Memory management classes: the interface ByteStream and its 
> implementations Block, FixedBlock, and BlocksStream, which provide a mechanism 
> to dynamically allocate and manage memory. They basically provide the 
> following functions. I tried some other stream mechanisms such as 
> ByteArrayInputStream, but they can't meet the 3rd criterion and don't have 
> good performance (50% loss)
>   - dynamically allocate memory
>   - reset memory for reuse
>   - BlocksStream makes sure the output slices will not change when extra 
> memory is needed; Block can change the reference of an output slice's buffer 
> if data was moved due to memory reallocation (BlocksStream is the better solution).
>   - WindowableBlocksStream extends BlocksStream and provides functions to 
> reset memory window by window instead of resetting all memory. It keeps a 
> certain amount of cache (in bytes) in memory
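
The length placeholder idea from the description can be illustrated with a 
small sketch. It is not the LengthValueBuffer from the patch: the class name 
LengthValueBufferSketch, the methods reserveLength(), write(), fillLength(), 
and reset(), and the 4-byte big-endian length prefix are all assumptions made 
for illustration. It also uses a single growable array, so like Block (and 
unlike BlocksStream) it copies data when it needs more memory.

```java
import java.util.Arrays;

// Sketch of a reusable buffer that stores data as length + value.
class LengthValueBufferSketch {
  private static final int LENGTH_BYTES = 4; // assumed size of the length prefix
  private byte[] data = new byte[16];
  private int size = 0;

  // Grow the backing array if fewer than 'extra' bytes are free.
  private void ensureCapacity(int extra) {
    if (size + extra > data.length) {
      data = Arrays.copyOf(data, Math.max(data.length * 2, size + extra));
    }
  }

  // Reserve room for a length that is not known yet; returns its position.
  int reserveLength() {
    ensureCapacity(LENGTH_BYTES);
    int position = size;
    size += LENGTH_BYTES;
    return position;
  }

  // Append value bytes directly into the shared buffer.
  void write(byte[] value) {
    ensureCapacity(value.length);
    System.arraycopy(value, 0, data, size, value.length);
    size += value.length;
  }

  // Back-fill the placeholder with the number of bytes written after it.
  void fillLength(int placeholderPosition) {
    int length = size - placeholderPosition - LENGTH_BYTES;
    data[placeholderPosition] = (byte) (length >>> 24);
    data[placeholderPosition + 1] = (byte) (length >>> 16);
    data[placeholderPosition + 2] = (byte) (length >>> 8);
    data[placeholderPosition + 3] = (byte) length;
  }

  // Make the memory reusable without releasing it.
  void reset() {
    size = 0;
  }

  int size() {
    return size;
  }

  // Copy of the valid bytes, for inspection only.
  byte[] toByteArray() {
    return Arrays.copyOf(data, size);
  }
}
```

A caller serializing an object with an embedded object would call 
reserveLength() for the outer object, serialize the embedded object into the 
same buffer in the same way, and then call fillLength() for each placeholder 
once its data is complete, so no temporary array is needed to learn the 
length first.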



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
