[jira] [Work logged] (HDDS-2467) Allow running Freon validators with limited memory

ASF GitHub Bot (Jira) Wed, 13 Nov 2019 07:33:33 -0800


     [ 
https://issues.apache.org/jira/browse/HDDS-2467?focusedWorklogId=342655&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-342655
 ]


ASF GitHub Bot logged work on HDDS-2467:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 13/Nov/19 15:32
            Start Date: 13/Nov/19 15:32
    Worklog Time Spent: 10m 
      Work Description: adoroszlai commented on pull request #152: HDDS-2467. 
Allow running Freon validators with limited memory
URL: https://github.com/apache/hadoop-ozone/pull/152
 
 
   ## What changes were proposed in this pull request?
   
   1. Freon validators read each item to be validated completely into a 
`byte[]` buffer.  This allows timing only the read (and buffer allocation), but 
not the subsequent digest calculation.  However, it also means that memory 
required for running the validators is proportional to key size.  I propose to 
add a command-line flag (`-s` / `--stream`) which, when specified, makes Freon 
calculate the digest while reading the input stream.  This changes timing 
results a bit, since values will include the time required for digest 
calculation.  On the other hand, Freon will be able to validate huge keys with 
limited memory.
   2. Reduce the memory requirement of the non-stream version by allocating a 
buffer exactly the size of the key.  This adds a bit of time overhead, since 
key info needs to be fetched, too.  But it eliminates `ByteArrayOutputStream`, 
which allocates progressively larger buffers.  The latter can lead to a memory 
requirement of twice the actual key size in the worst case (since `2^n > 
2^(n-1) + 2^(n-2) + ...`).
   3. Get rid of code duplication between `SameKeyReader` and 
`OzoneClientKeyValidator`.
   4. Allow `OzoneClientKeyGenerator` to create > 2GB keys.
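
   The two read modes from items 1 and 2 can be sketched roughly as follows.  
This is an illustration only, not the code in this pull request: the class and 
method names are hypothetical, and SHA-256 is assumed as the digest purely for 
the example.  `digestStreaming` keeps memory bounded by the buffer size, while 
`digestBuffered` allocates exactly the key size and digests after the read.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.Random;

// Hypothetical sketch; names are not taken from the Freon code base.
final class DigestSketch {

  // Assumption: SHA-256 is used here only for illustration.
  static MessageDigest newDigest() {
    try {
      return MessageDigest.getInstance("SHA-256");
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException(e);
    }
  }

  // Streaming mode: update the digest while reading, so memory use is
  // bounded by bufferSize (must be positive) regardless of key size.
  static byte[] digestStreaming(InputStream in, int bufferSize) {
    MessageDigest md = newDigest();
    byte[] buffer = new byte[bufferSize];
    try {
      int n;
      while ((n = in.read(buffer)) != -1) {
        md.update(buffer, 0, n);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return md.digest();
  }

  // Non-stream mode: read into a buffer of exactly keySize bytes, then
  // digest. The read could be timed separately from the digest step.
  static byte[] digestBuffered(InputStream in, int keySize) {
    byte[] data = new byte[keySize];
    try {
      int off = 0;
      while (off < keySize) {
        int n = in.read(data, off, keySize - off);
        if (n == -1) {
          throw new EOFException("key shorter than expected");
        }
        off += n;
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return newDigest().digest(data);
  }

  public static void main(String[] args) {
    byte[] key = new byte[1 << 20];
    new Random(42).nextBytes(key);
    byte[] streamed = digestStreaming(new ByteArrayInputStream(key), 4096);
    byte[] buffered = digestBuffered(new ByteArrayInputStream(key), key.length);
    System.out.println(Arrays.equals(streamed, buffered));
  }
}
```

   Both methods produce the same digest for the same input; they differ only 
in peak memory and in what a timer around the read would measure.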
   
   https://issues.apache.org/jira/browse/HDDS-2467
   
   ## How was this patch tested?
   
   Created and validated keys using Freon.  Verified that even a 2.5GB key can 
be created and validated with `--stream`.  Verified that streaming is forced 
for such a large key, since it won't fit in any array.  Verified that smaller 
keys can be validated both ways.
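
   The reason streaming is forced for the 2.5GB key is that Java arrays are 
indexed by `int`, so a `byte[]` cannot hold more than `Integer.MAX_VALUE` 
bytes.  A minimal sketch of such a check, and of writing a key whose size is 
tracked as a `long`, might look like this.  The names `mustStream` and 
`writeRepeated` are hypothetical, and the exact threshold used by the patch is 
an assumption.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical sketch; names are not taken from the Freon code base.
final class LargeKeySketch {

  // Assumption: a key must be streamed when it cannot fit in one byte[].
  // Real JVMs may cap array length slightly below Integer.MAX_VALUE.
  static boolean mustStream(long keySize) {
    return keySize > Integer.MAX_VALUE;
  }

  // Writes `size` bytes by repeating `chunk` (must be non-empty). The
  // remaining count is a long, so sizes above 2GB do not overflow.
  static long writeRepeated(OutputStream out, long size, byte[] chunk) {
    long remaining = Math.max(size, 0);
    long written = 0;
    try {
      while (remaining > 0) {
        int n = (int) Math.min(chunk.length, remaining);
        out.write(chunk, 0, n);
        remaining -= n;
        written += n;
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return written;
  }

  public static void main(String[] args) {
    System.out.println(mustStream(2684354560L)); // 2.5GB key from the tests
    System.out.println(mustStream(268435456L));  // 256MB key from the tests
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    System.out.println(writeRepeated(out, 10_240L, new byte[4096]));
  }
}
```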
   
   ```
   export HADOOP_OPTS='-Xmx1024M -XX:+HeapDumpOnOutOfMemoryError'
   ozone freon ockg -t 1 -F ONE -n 1 -p 2_5GB -s 2684354560
   ozone freon ockg -t 1 -F ONE -n 1 -p 256MB -s  268435456
   ozone freon ockg -t 1 -F ONE -n 1 -p 128MB -s  134217728
   ozone freon ockg -t 1 -F ONE -n 1 -p  64MB -s   67108864
   ozone freon ockg -t 1 -F ONE -n 1 -p  10KB -s      10240
   
   export HADOOP_OPTS='-Xmx128M -XX:+HeapDumpOnOutOfMemoryError'
   ozone freon ockv -t 1 -n 1 -p  10KB
   ozone freon ockv -t 1 -n 1 -p  64MB
   
   export HADOOP_OPTS='-Xmx64M -XX:+HeapDumpOnOutOfMemoryError'
   ozone freon ockv -t 1 -n 1 -p  10KB -s
   ozone freon ockv -t 1 -n 1 -p  64MB -s
   ozone freon ockv -t 1 -n 1 -p 128MB -s
   ozone freon ockv -t 1 -n 1 -p 256MB -s
   ozone freon ockv -t 1 -n 1 -p 2_5GB -s
   ozone freon ockv -t 1 -n 1 -p 2_5GB
   
   ozone freon ockg -t 1 -F ONE -n 100 -p 1KB -s 1024
   ozone freon ockv -n 100 -p 1KB
   
   ozone freon ocokr -t 4 -k  '64MB/0' -n 32 -s
   ozone freon ocokr -t 8 -k '256MB/0' -n 16 -s
   
   export HADOOP_OPTS='-Xmx1024M -XX:+HeapDumpOnOutOfMemoryError'
   ozone freon ocokr -t 2 -k '256MB/0' -n 16
   ```
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

            Worklog Id:     (was: 342655)
    Remaining Estimate: 0h
            Time Spent: 10m

> Allow running Freon validators with limited memory
> --------------------------------------------------
>
>                 Key: HDDS-2467
>                 URL: https://issues.apache.org/jira/browse/HDDS-2467
>             Project: Hadoop Distributed Data Store
>          Issue Type: Improvement
>          Components: freon
>            Reporter: Attila Doroszlai
>            Assignee: Attila Doroszlai
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Freon validators read each item to be validated completely into a {{byte[]}} 
> buffer.  This allows timing only the read (and buffer allocation), but not 
> the subsequent digest calculation.  However, it also means that memory 
> required for running the validators is proportional to key size.
> I propose to add a command-line flag to allow calculating the digest while 
> reading the input stream.  This changes timing results a bit, since values 
> will include the time required for digest calculation.  On the other hand, 
> Freon will be able to validate huge keys with limited memory.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

