[ https://issues.apache.org/jira/browse/OOZIE-2662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15556025#comment-15556025 ]
Robert Kanter commented on OOZIE-2662:
--------------------------------------
I'd be hesitant to simply skip rows that have a problem because we often
have rows that depend on other rows, so we'd need to be more clever than simply
skipping bad rows. For example, if there's a bad CoordinatorJobBean and we
skip it, that will cause problems if we don't also skip all of its child
CoordinatorActionBeans (because they'll have pointers to it).
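To illustrate why skipping has to cascade, here is a minimal sketch in plain Java. DumpRow and SkipClosure are hypothetical stand-ins, not Oozie classes, and the assumption is that each dumped row carries at most one parent id. It only shows the bookkeeping needed so that a skipped parent takes its children with it.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a dumped row; not an Oozie class.
final class DumpRow {
    final String id;
    final String parentId; // null for top-level rows such as coordinator jobs

    DumpRow(String id, String parentId) {
        this.id = id;
        this.parentId = parentId;
    }
}

final class SkipClosure {
    // Returns the ids of every row that must be skipped: the bad rows plus
    // all rows that point, directly or through other skipped rows, at one.
    static Set<String> rowsToSkip(List<DumpRow> rows, Set<String> badIds) {
        Set<String> skipped = new HashSet<>(badIds);
        boolean changed = true;
        while (changed) { // repeat until a full pass finds no new descendants
            changed = false;
            for (DumpRow row : rows) {
                boolean parentSkipped = row.parentId != null && skipped.contains(row.parentId);
                if (parentSkipped && skipped.add(row.id)) {
                    changed = true;
                }
            }
        }
        return skipped;
    }
}
```

With a bad "coord-1" row and two action rows pointing at it, rowsToSkip returns all three ids, which is the extra cleverness a skip-based import would need.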
We shouldn't have duplicate entries or other constraint violations, unless
someone went and messed with the dump data file or if it got corrupted somehow,
right? I'm thinking that if there's any problem with the data, we should bail
out and not import any of it.
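A rough sketch of the bail-out approach, again in plain Java: check the whole dump first and import nothing if any check fails. DumpValidator, the {id, parentId} row shape, and the in-memory target list are assumptions made for illustration; they are not the import tool's actual API, and a real check would cover more than duplicate ids and missing parents.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical validator; each row is {id, parentId}, parentId may be null.
final class DumpValidator {
    // Collects every duplicate id and every reference to a missing parent.
    static List<String> findProblems(List<String[]> rows) {
        List<String> problems = new ArrayList<>();
        Set<String> ids = new HashSet<>();
        for (String[] row : rows) {
            if (!ids.add(row[0])) {
                problems.add("duplicate id: " + row[0]);
            }
        }
        for (String[] row : rows) {
            if (row[1] != null && !ids.contains(row[1])) {
                problems.add("missing parent " + row[1] + " for " + row[0]);
            }
        }
        return problems;
    }

    // Imports every row, or throws before touching the target at all.
    static void importAllOrNothing(List<String[]> rows, List<String[]> target) {
        List<String> problems = findProblems(rows);
        if (!problems.isEmpty()) {
            throw new IllegalStateException("Dump rejected, nothing imported: " + problems);
        }
        target.addAll(rows);
    }
}
```

Because validation finishes before the first write, a rejected dump leaves the target untouched.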
> DB migration fails if DB is too big
> -----------------------------------
>
> Key: OOZIE-2662
> URL: https://issues.apache.org/jira/browse/OOZIE-2662
> Project: Oozie
> Issue Type: Bug
> Reporter: Peter Cseh
> Assignee: Andras Piros
> Attachments: OOZIE-2662.001.patch, OOZIE-2662.002.wip.patch
>
>
> The initial version of the DB import tool commits all the workflows, actions
> etc. in one huge commit. If it does not fit into memory, an OOME is thrown.
> We should commit every 1k or 10k elements to prevent this.
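
The batching idea in the description could look roughly like the following. BatchImporter and its Sink interface are hypothetical names for this sketch; in a real tool the sink would wrap a database transaction, and the batch size would be the 1k or 10k mentioned above.

```java
// Hypothetical batch importer; Sink stands in for a DB transaction wrapper.
final class BatchImporter {
    interface Sink<T> {
        void persist(T row); // stage one row in the current transaction
        void commit();       // flush the staged rows and start a new batch
    }

    // Persists all rows, committing after every batchSize rows.
    // Returns the number of commits issued.
    static <T> int importInBatches(Iterable<T> rows, Sink<T> sink, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int pending = 0;
        int commits = 0;
        for (T row : rows) {
            sink.persist(row);
            pending++;
            if (pending == batchSize) {
                sink.commit();
                commits++;
                pending = 0;
            }
        }
        if (pending > 0) { // commit the final, partially filled batch
            sink.commit();
            commits++;
        }
        return commits;
    }
}
```

One trade-off to keep in mind: once batches are committed separately, a failure partway through leaves the earlier batches in the database, so the import is no longer all-or-nothing unless the data is checked before the first commit.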