[ https://issues.apache.org/jira/browse/OAK-9095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17123919#comment-17123919 ]

Thomas Mueller commented on OAK-9095:
-------------------------------------

Thanks! My comments:

The main issue is that when the hard limit is reached, new entries are
discarded, but writing is not prevented (no exception is thrown). I would
throw an exception instead, so that the write fails and the transaction is
not committed.
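
A minimal sketch of what I mean, reusing the constant from the patch snippet
below (the exception type and message are only an illustration, not what the
patch currently does):

{noformat}
if (base.size() >= MapRecord.ERROR_SIZE_HARD_STOP) {
    // Fail the write instead of silently dropping the entry, so that the
    // transaction cannot be committed with a corrupt map.
    throw new UnsupportedOperationException(
            "Map record has " + base.size() + " entries, writes are not allowed");
}
{noformat}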

There is no unit test. I guess it's hard to write one, but it's fairly 
important to ensure this works as expected... Maybe by injecting a very high 
entry count (near the different limits) into the data structure? See the 
sketch below.
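
To make that concrete, a hypothetical JUnit sketch. It does not touch the real
Oak writer API; it assumes the proposed branching is extracted into a small
helper (checkMapSize is a name I made up), and the threshold values are taken
from the issue description (ERROR_SIZE_HARD_STOP is assumed to equal
MapRecord.MAX_SIZE):

{noformat}
import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertThrows;

import org.junit.Test;

public class MapRecordSizeCheckTest {

    // Values from the issue description; the real constants live in MapRecord.
    private static final long WARN_SIZE = 400_000_000L;
    private static final long ERROR_SIZE = 500_000_000L;
    private static final long ERROR_SIZE_HARD_STOP = 536_870_911L;

    // Hypothetical helper encapsulating the proposed checks.
    private static String checkMapSize(long size) {
        if (size >= WARN_SIZE) {
            if (size >= ERROR_SIZE_HARD_STOP) {
                throw new UnsupportedOperationException(
                        "too many map entries: " + size);
            } else if (size >= ERROR_SIZE) {
                return "error";
            } else {
                return "warn";
            }
        }
        return "ok";
    }

    @Test
    public void belowWarnLimit() {
        assertEquals("ok", checkMapSize(WARN_SIZE - 1));
    }

    @Test
    public void atWarnLimit() {
        assertEquals("warn", checkMapSize(WARN_SIZE));
    }

    @Test
    public void atErrorLimit() {
        assertEquals("error", checkMapSize(ERROR_SIZE));
    }

    @Test
    public void atHardStop() {
        assertThrows(UnsupportedOperationException.class,
                () -> checkMapSize(ERROR_SIZE_HARD_STOP));
    }
}
{noformat}

A test against the real writer would still be needed to verify that the count
actually reaches these checks, e.g. by injecting the near-limit count into the
record being written.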

For performance, I would try to avoid multiple conditional checks in the normal 
case. Instead of

{noformat}
if (base.size() >= MapRecord.ERROR_SIZE_HARD_STOP) {
  A
} else if (base.size() >= MapRecord.ERROR_SIZE) {
  B
} else if (base.size() >= MapRecord.WARN_SIZE) {
  C
}
{noformat}

I would use:

{noformat}
if (base.size() >= MapRecord.WARN_SIZE) {
    if (base.size() >= MapRecord.ERROR_SIZE_HARD_STOP) {
      A
    } else if (base.size() >= MapRecord.ERROR_SIZE) {
      B
    } else {
      C
    }
}
{noformat}

> private long ts;

Nit: I wouldn't use abbreviations in field names; I would use "lastLogTime" 
for the field and "now" for the local variable.

> if ((ts - this.ts) / 1000 >= 1) {

Division is slow, but the main reason is simplicity: what about 
"if (now - lastLogTime > 1000)" instead? See the sketch below.
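
Putting the two nits together, roughly what I have in mind (the method name,
the log message and the LOG field are illustrative, not the patch's actual
code):

{noformat}
private long lastLogTime;

private void warnAboutMapSize(long size) {
    // Assumes an org.slf4j.Logger field named LOG.
    long now = System.currentTimeMillis();
    // Log at most once per second to avoid flooding the log.
    if (now - lastLogTime > 1000) {
        LOG.warn("Map record has {} entries, exceeding the warning threshold", size);
        lastLogTime = now;
    }
}
{noformat}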


> MapRecord corruption when adding more than MapRecord.MAX_SIZE entries in 
> branch record
> --------------------------------------------------------------------------------------
>
>                 Key: OAK-9095
>                 URL: https://issues.apache.org/jira/browse/OAK-9095
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: segment-tar
>    Affects Versions: 1.22.3, 1.8.22, 1.30.0
>            Reporter: Andrei Dulceanu
>            Assignee: Andrei Dulceanu
>            Priority: Major
>         Attachments: OAK-9095.patch
>
>
> It is now possible to write a {{MapRecord}} with a huge number of entries, 
> going over the maximum limit, {{MapRecord.MAX_SIZE}} (i.e. 536,870,911 
> entries). This issue stems from the fact that the number of entries is 
> checked when writing a map leaf record [0], but not when writing a map branch 
> record [1]. When more than {{MapRecord.MAX_SIZE}} entries are written in a 
> branch record [2], the {{entryCount}} overflows into the first bit of the 
> level, essentially rendering the entire HAMT structure corrupt, since the 
> root branch record will be stored now at level 1, instead of level 0, 
> reporting an incorrect size as well (i.e. actual size - 
> {{MapRecord.MAX_SIZE}}).
> Since this is a hard limit of the segment store and going above this number 
> would mean rewriting the internals of the HAMT structure currently in use, I 
> propose the following mitigation:
> * add a size check for the branch record to not allow going over the limit
> * log a warning when the number of entries goes over 400,000,000
> * log an error when the number of entries goes over 500,000,000 and do not 
> allow any write operations on the node
> * allow further writes only if 
> {{oak.segmentNodeStore.allowWritesOnHugeMapRecord}} system property is present
> [0] 
> https://github.com/apache/jackrabbit-oak/blob/1.22/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/DefaultSegmentWriter.java#L284
> [1] 
> https://github.com/apache/jackrabbit-oak/blob/1.22/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/DefaultSegmentWriter.java#L291
> [2] 
> https://github.com/apache/jackrabbit-oak/blob/1.22/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/RecordWriters.java#L231
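
For illustration, a tiny standalone sketch (not Oak code) of the overflow
described above. The bit layout is assumed from the description: level in the
high bits, entry count in the low 29 bits, MAX_SIZE = 2^29 - 1 = 536,870,911:

{noformat}
public class MapRecordOverflowDemo {

    static final int SIZE_BITS = 29;                   // assumed width of the count field
    static final int MAX_SIZE = (1 << SIZE_BITS) - 1;  // 536870911

    // Assumed packing of the record head: level above the entry count.
    static int pack(int level, int count) {
        return (level << SIZE_BITS) | count;
    }

    static int level(int head) {
        return head >>> SIZE_BITS;
    }

    static int count(int head) {
        return head & MAX_SIZE;
    }

    public static void main(String[] args) {
        int entries = MAX_SIZE + 1;       // one entry too many, unchecked for branch records
        int head = pack(0, entries);      // the count spills into the first level bit
        System.out.println(level(head));  // 1 -> root reported at level 1 instead of 0
        System.out.println(count(head));  // 0 -> reported size has wrapped around
    }
}
{noformat}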


