[ https://issues.apache.org/jira/browse/OAK-9095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17123919#comment-17123919 ]
Thomas Mueller commented on OAK-9095:
-------------------------------------

Thanks! My comments:

The main issue is that when the hard limit is reached, new entries are discarded, but writing is not prevented (no exception is thrown). I would throw an exception instead, so that writes fail and the transaction is not committed (see the sketch at the end of this comment).

There is no unit test. I guess it's hard to write one, but it's fairly important to ensure this works as expected... Maybe it could be done by injecting a very high number (near the different limits) into the data structure?

For performance, I would try to avoid multiple conditional checks in the normal case. Instead of
{noformat}
if (base.size() >= MapRecord.ERROR_SIZE_HARD_STOP) {
    A
} else if (base.size() >= MapRecord.ERROR_SIZE) {
    B
} else if (base.size() >= MapRecord.WARN_SIZE) {
    C
}
{noformat}
I would use:
{noformat}
if (base.size() >= MapRecord.WARN_SIZE) {
    if (base.size() >= MapRecord.ERROR_SIZE_HARD_STOP) {
        A
    } else if (base.size() >= MapRecord.ERROR_SIZE) {
        B
    } else {
        C
    }
}
{noformat}

> private long ts;

Nit: I wouldn't use abbreviations in field names; I would call the field "lastLogTime" and the local variable "now".

> if ((ts - this.ts) / 1000 >= 1) {

(Division is slow, but the main reason is simplicity.) What about "if (now - lastLogTime > 1000)" instead?


> MapRecord corruption when adding more than MapRecord.MAX_SIZE entries in branch record
> ---------------------------------------------------------------------------------------
>
>                 Key: OAK-9095
>                 URL: https://issues.apache.org/jira/browse/OAK-9095
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: segment-tar
>    Affects Versions: 1.22.3, 1.8.22, 1.30.0
>            Reporter: Andrei Dulceanu
>            Assignee: Andrei Dulceanu
>            Priority: Major
>         Attachments: OAK-9095.patch
>
>
> It is currently possible to write a {{MapRecord}} with a huge number of entries, going over the maximum limit {{MapRecord.MAX_SIZE}} (i.e. 536.870.911 entries). The issue stems from the fact that the number of entries is checked when writing a map leaf record [0], but not when writing a map branch record [1]. When more than {{MapRecord.MAX_SIZE}} entries are written in a branch record [2], the {{entryCount}} overflows into the first bit of the level, essentially rendering the entire HAMT structure corrupt: the root branch record is now stored at level 1 instead of level 0, and it also reports an incorrect size (i.e. actual size - {{MapRecord.MAX_SIZE}}).
> Since this is a hard limit of the segment store, and going above this number would mean rewriting the internals of the HAMT structure currently in use, I propose the following mitigation:
> * add a size check for the branch record so that the limit cannot be exceeded
> * log a warning when the number of entries goes over 400.000.000
> * log an error when the number of entries goes over 500.000.000 and do not allow any further write operations on the node
> * allow further writes only if the {{oak.segmentNodeStore.allowWritesOnHugeMapRecord}} system property is present
> [0] https://github.com/apache/jackrabbit-oak/blob/1.22/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/DefaultSegmentWriter.java#L284
> [1] https://github.com/apache/jackrabbit-oak/blob/1.22/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/DefaultSegmentWriter.java#L291
> [2] https://github.com/apache/jackrabbit-oak/blob/1.22/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/RecordWriters.java#L231
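To make the two suggestions above concrete (throwing on the hard stop, and throttling the warning to once per second), here is a rough sketch of how the combined check could look. {{MapRecord.WARN_SIZE}}, {{MapRecord.ERROR_SIZE}} and {{MapRecord.ERROR_SIZE_HARD_STOP}} are the constants from the patch; the method name, the logger and the exception type are only illustrative, not what the patch actually does:
{noformat}
private static final Logger log = LoggerFactory.getLogger(DefaultSegmentWriter.class);

private long lastLogTime;

// Illustrative only: a single entry point for the size check, structured so that the
// common case (size below WARN_SIZE) does exactly one comparison.
private void checkMapSize(int size) {
    if (size >= MapRecord.WARN_SIZE) {
        if (size >= MapRecord.ERROR_SIZE_HARD_STOP) {
            // fail the write so the transaction is not committed
            throw new IllegalStateException(
                    "Map record has " + size + " entries, hard limit is "
                            + MapRecord.ERROR_SIZE_HARD_STOP);
        } else if (size >= MapRecord.ERROR_SIZE) {
            log.error("Map record has {} entries, exceeding the error limit {}",
                    size, MapRecord.ERROR_SIZE);
        } else {
            long now = System.currentTimeMillis();
            if (now - lastLogTime > 1000) { // log at most once per second
                log.warn("Map record has {} entries, exceeding the warning limit {}",
                        size, MapRecord.WARN_SIZE);
                lastLogTime = now;
            }
        }
    }
}
{noformat}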
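On the missing unit test: if the check is isolated in a helper like the hypothetical one above, it can be exercised directly with synthetic sizes near the limits, without actually writing hundreds of millions of entries. Again only a sketch (JUnit 4 style, with {{fail}} statically imported from {{org.junit.Assert}}), testing the helper rather than the real write path through {{DefaultSegmentWriter}}:
{noformat}
@Test
public void sizeCheckNearLimits() {
    checkMapSize(MapRecord.WARN_SIZE - 1);            // below all limits: nothing happens
    checkMapSize(MapRecord.WARN_SIZE);                // warning (throttled)
    checkMapSize(MapRecord.ERROR_SIZE);               // error is logged, write still allowed
    try {
        checkMapSize(MapRecord.ERROR_SIZE_HARD_STOP); // hard stop: the write must fail
        fail("expected the write to be rejected at the hard stop");
    } catch (IllegalStateException expected) {
        // expected
    }
}
{noformat}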