[
https://issues.apache.org/jira/browse/DAFFODIL-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Steve Lawrence resolved DAFFODIL-2468.
--------------------------------------
Fix Version/s: 3.1.0
Resolution: Fixed
Fixed in commit 0a390f67a28f1f65b84b52583917beda34de1a88
> Unparsing an infoset for an 800mb csv file runs out of memory
> -------------------------------------------------------------
>
> Key: DAFFODIL-2468
> URL: https://issues.apache.org/jira/browse/DAFFODIL-2468
> Project: Daffodil
> Issue Type: Bug
> Affects Versions: 3.1.0
> Reporter: Dave Thompson
> Assignee: Steve Lawrence
> Priority: Major
> Fix For: 3.1.0
>
> Attachments: csv_data800m.csv.gz
>
>
> While verifying DAFFODIL-2455 (Large CSV file causes "Attempting to
> backtrack too far" exception), found that unparsing the infoset of the
> successfully parsed 800mb CSV file ran out of memory.
> Increased the DAFFODIL_JAVA_OPTS memory setting several times, up to 32gb,
> and tried unparsing the infoset, running out of memory each time. Ran on a
> test platform which has 90+GB of memory.
> Parsed and unparsed using the schema from the dfdl-schemas/dfdl-csv repo.
> The 800mb CSV file (csv_data800m.csv) is attached, gzipped.