[
https://issues.apache.org/jira/browse/DAFFODIL-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dave Thompson closed DAFFODIL-2468.
-----------------------------------
Verified the specified commit (0a390f67a28f1f65b84b52583917beda34de1a88)
is included in the latest pull from the incubator-daffodil repository.
Verified that the daffodil-runtime1 sub-project's sbt tests execute successfully.
Verified that the two 800MB CSV infosets (including the attached one) now
successfully unparse on the nightly test platform with the original
DAFFODIL_JAVA_OPTS memory setting.
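For reference, this is a minimal sketch of how the memory setting was exercised when re-running the unparse; the heap size, schema filename, and infoset filename here are assumptions for illustration, not the exact values used:

```shell
# Assumed example: DAFFODIL_JAVA_OPTS passes JVM options to the Daffodil CLI,
# so raising -Xmx increases the heap available to the unparse.
export DAFFODIL_JAVA_OPTS="-Xmx8g"

# Hypothetical schema and infoset names; unparse the infoset back to CSV.
daffodil unparse -s csv.dfdl.xsd -o csv_data800m.csv infoset.xml
```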
> Unparsing an infoset for an 800mb csv file runs out of memory
> -------------------------------------------------------------
>
> Key: DAFFODIL-2468
> URL: https://issues.apache.org/jira/browse/DAFFODIL-2468
> Project: Daffodil
> Issue Type: Bug
> Components: Back End
> Affects Versions: 3.1.0
> Reporter: Dave Thompson
> Assignee: Steve Lawrence
> Priority: Major
> Fix For: 3.1.0
>
> Attachments: csv_data800m.csv.gz
>
>
> While verifying DAFFODIL-2455 (Large CSV file causes "Attempting to
> backtrack too far" exception), found that unparsing the successfully parsed
> 800MB CSV file's infoset ran out of memory.
> Increased the DAFFODIL_JAVA_OPTS memory setting several times, up to 32GB, and
> tried unparsing the infoset, each time running out of memory. Ran on a test
> platform which has 90+GB of memory.
> Parsed and unparsed using the schema from the dfdl-schemas/dfdl-csv repo.
> The 800MB CSV file (csv_data800m.csv) is attached gzipped.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)