[ 
https://issues.apache.org/jira/browse/DAFFODIL-2706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17559339#comment-17559339
 ] 

Steve Lawrence commented on DAFFODIL-2706:
------------------------------------------

To test Daffodil runtime1 memory usage, I created a saved parser with this 
command:
{code:bash}
daffodil save-parser -s path/to/schema.dfdl.xsd savedParser.bin
{code}
For parsing, I used this command:
{code:bash}
while true; do cat path/to/file.bin; done | \
  daffodil parse -P savedParser.bin --stream -o /dev/null -
{code}
For unparsing, I used this command:
{code:bash}
while true; do cat path/to/file.xml; done | \
  daffodil unparse -P savedParser.bin --stream -o /dev/null -
{code}
 
Note that the file.xml file has a single null byte appended to the end, which is 
needed for the --stream option.
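For reference, appending that trailing null byte can be done with printf. This 
is a hypothetical sketch; the sample content stands in for the real infoset at 
path/to/file.xml:
{code:bash}
printf '<r>test</r>' > file.xml   # stand-in for the real XML infoset
printf '\0' >> file.xml           # append the single NUL byte --stream expects
{code}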

The JVM heap size was set with the -Xmx flag, using Java 8.
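A minimal sketch of how the heap cap (and GC logging, if desired) might be 
passed to the CLI; the daffodil launcher script reads DAFFODIL_JAVA_OPTS, and 
the GC-log flags shown are the Java 8 spellings. The exact values here are 
illustrative:
{code:bash}
# Cap the heap at 32MB and write GC activity to gc.log (illustrative values)
export DAFFODIL_JAVA_OPTS="-Xmx32m -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:gc.log"
{code}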

The schema is a couple hundred lines with about 40 elements; compiled and 
serialized it is about 34 KB.

The parsed file is about 200 bytes. The unparsed file is about 4000 bytes.

Parse Numbers:
||JVM Heap Size||Time spent in GC||
|6MB|>95%|
|8MB|8%|
|16MB|3%|
|32MB|1%|
|64MB|<1%|
|128MB|<0.5%|

Unparse Numbers:
||JVM Heap Size||Time spent in GC||
|6MB|>80%|
|8MB|7%|
|16MB|2%|
|32MB|1%|
|64MB|<0.5%|
|128MB|<0.05%|

With less than 6MB, the JVM could not start.
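The comment does not say how the GC percentages were extracted; one way to 
estimate them, assuming a Java 8 GC log was written via -Xloggc and the 
wall-clock run time is known, is to sum the logged pause times. The log line 
and run_secs value below are stand-ins:
{code:bash}
printf '[GC (Allocation Failure) 1K->0K(2K), 0.5000000 secs]\n' > gc.log  # stand-in log line
run_secs=60   # assumed wall-clock run time of the test
awk -v total="$run_secs" '
  { for (i = 1; i < NF; i++) if ($(i+1) == "secs]") gc += $i }
  END { printf "time in GC: %.2f%%\n", 100 * gc / total }
' gc.log
{code}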

In all of these cases, the RES memory reported by "top" varied but was usually 
around 200-250 MB. Adjusting the heap size did not seem to have a noticeable 
impact on this value; it seemed to depend more on how long the process had run.

I did not do any performance testing or measure how much data was 
parsed/unparsed, but minimizing time spent in garbage collection is pretty 
critical, so the larger heap sizes would likely perform better.

Based on these numbers, a JVM heap size of 32MB seems like a practical lower 
limit, with a total of about 250MB of memory required for the JVM, loaded jars, 
and other Java overhead.

It would be interesting to test this on more complex schemas/data to see how 
this scales.

> Memory size limits - measurements needed - doc online should publish it
> -----------------------------------------------------------------------
>
>                 Key: DAFFODIL-2706
>                 URL: https://issues.apache.org/jira/browse/DAFFODIL-2706
>             Project: Daffodil
>          Issue Type: Bug
>          Components: Documentation, QA
>    Affects Versions: 3.3.0
>            Reporter: Mike Beckerle
>            Priority: Major
>              Labels: beginner
>
> (Categorized this as a bug, because this is not a measurement we are making 
> and talking about in our documentation)
> Users have requested information about how small a JVM memory footprint can 
> run Daffodil, assuming loading of a precompiled DFDL schema. 
> We should have such a test for Runtime 1, and the results should be published 
> for each release. 
> This is of course somewhat schema-dependent. That is, a schema for large 
> messages that cannot be streamed for unparsing will have a large footprint no 
> matter what. I think the interest is in small message data, e.g. data that is 
> typically messages up to 1 KByte in size. 
> The test needs to scrutinize time spent in Java Garbage Collection overhead, 
> as the memory should not be so small as to drive up the GC overhead level. 
> The interest comes from wanting to run Daffodil on devices using smaller CPUs 
> such as are found in embedded devices, phones, gateways, etc. (ARM and Atom 
> CPUs typically)



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
