[ https://issues.apache.org/jira/browse/DAFFODIL-2851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mike Beckerle reassigned DAFFODIL-2851:
---------------------------------------

    Assignee: Steve Lawrence

> Excessive allocations in StringOfSpecifiedLengthMixin
> -----------------------------------------------------
>
>                 Key: DAFFODIL-2851
>                 URL: https://issues.apache.org/jira/browse/DAFFODIL-2851
>             Project: Daffodil
>          Issue Type: Bug
>          Components: Back End, Performance
>            Reporter: Steve Lawrence
>            Assignee: Steve Lawrence
>            Priority: Major
>             Fix For: 3.7.0
>
>
> The StringOfSpecifiedLengthMixin passes the value of the
> "maximumSimpleElementSizeInCharacters" tunable to the getSomeString function:
>
> https://github.com/apache/daffodil/blob/main/daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/processors/parsers/StringLengthParsers.scala#L89-L94
>
> The getSomeString function calls withLocalCharBuffer, which allocates a char
> buffer of that size to decode the string into. Currently, the tunable
> defaults to 1MB. That is large enough to be a noticeable contributor to
> allocations and CPU usage when profiling.
>
> Fortunately, the allocated char buffer is cached and reused during the parse
> (though each parse allocates a new one), so it is only a one-time penalty per
> parse. But most files are not going to contain single strings nearly that
> large, so this large allocation is just a waste.
>
> We should consider ways to reduce this allocation. Some options:
>
> - Simply decrease the tunable.
> - Change the logic so StringOfSpecifiedLength allocates a much smaller
>   amount and grows the buffer if needed, maybe taking bitLimit into account.
> - Share the buffer among different parses in a ThreadLocal, so we still
>   allocate a large buffer, but the penalty is paid once per thread instead
>   of once per parse.
>
> Likely other options...
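
For illustration only, here is a minimal Scala sketch that combines two of the options raised in the issue: a buffer that starts small and grows on demand, held in a ThreadLocal so it is reused across parses on the same thread. It is not Daffodil code. The object name GrowableCharBuffer, the method withBuffer, and the 1024-char starting size are all hypothetical, and the sketch ignores nested (reentrant) use of the buffer, which a real fix would need to handle.

```scala
import java.nio.CharBuffer

// Hypothetical helper, not part of Daffodil: a per-thread CharBuffer that
// starts small and is replaced by a larger one only when a caller needs it.
object GrowableCharBuffer {
  // Assumed starting size in chars; chosen arbitrarily for this sketch.
  private val initialCapacity = 1024

  private val local = new ThreadLocal[CharBuffer] {
    override def initialValue(): CharBuffer = CharBuffer.allocate(initialCapacity)
  }

  /**
   * Runs `body` with a cleared buffer whose limit is min(needed, maxChars).
   * `maxChars` plays the role of a cap such as the tunable in the issue.
   */
  def withBuffer[T](needed: Int, maxChars: Int)(body: CharBuffer => T): T = {
    require(needed >= 0 && maxChars >= 0, "sizes must be non-negative")
    val want = math.min(needed, maxChars)
    var cb = local.get()
    if (cb.capacity() < want) {
      // Double the capacity until it fits, without exceeding the cap
      // (the comparison against maxChars / 2 also avoids Int overflow).
      var cap = math.max(cb.capacity(), 1)
      while (cap < want) cap = if (cap > maxChars / 2) maxChars else cap * 2
      cb = CharBuffer.allocate(cap)
      local.set(cb)
    }
    cb.clear()
    cb.limit(want)
    body(cb)
  }
}
```

With this shape, a parse that only ever sees short strings never allocates more than the initial buffer, while a thread that has once decoded a large string keeps the larger buffer for later parses. Whether retaining that memory per thread is acceptable is a design question the issue leaves open.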