[
https://issues.apache.org/jira/browse/OAK-4582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16031154#comment-16031154
]
Francesco Mari commented on OAK-4582:
-------------------------------------
The first part of the refactoring is now open for review on GitHub. This is
what I've done so far:
* {{RawRecordWriter}} and {{RawRecordReader}} have been created. They serialize
records into, and deserialize records (or parts of a record) from, anything
that satisfies a very small set of interfaces. These interfaces abstract the
primitive operations that we need to write records into segments.
* The interfaces required by {{RawRecordReader}} and {{RawRecordWriter}} are
currently implemented by {{RecordReader}} and {{SegmentBufferWriter}}. The code
already uses the new serialization/deserialization logic, but the way segments
are represented has not been switched yet (see below).
* Many tests have been introduced to test the serialization/deserialization of
records at the lowest possible level of granularity. If a single bit flips
somewhere, these tests can pinpoint the kind of record and the field of that
record that is affected. Moreover, no file system is required to run these
tests.
* {{SegmentReader}} and {{SegmentWriter}} have been introduced. These classes
represent respectively a read-only and a read-write segment, but they are not
currently used in production code. Enough tests have been written, though, to
prove that their behaviour is identical to that of the classes that
currently serialize/deserialize segments.
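To illustrate the first and third points, here is a minimal sketch of a writer
and a reader that depend only on tiny interfaces, so a test can run against a
plain byte array and check individual fields. All names here ({{ByteSink}},
{{ByteSource}}, the toy record layout) are assumptions made for the example;
they are not the interfaces or the record format from the actual change.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical names throughout; this is NOT the Oak API.
final class RawRecordSketch {

    /** The only primitive the writer needs from its target. */
    interface ByteSink {
        void writeByte(byte b);
    }

    /** The only primitive the reader needs from its source. */
    interface ByteSource {
        byte readByte(int offset);
    }

    /** Writes a toy record: a 1-byte type tag, then a 4-byte big-endian int. */
    static final class Writer {
        static final byte VALUE_TAG = 1;
        private final ByteSink sink;

        Writer(ByteSink sink) {
            this.sink = sink;
        }

        void writeValue(int value) {
            sink.writeByte(VALUE_TAG);
            for (int shift = 24; shift >= 0; shift -= 8) {
                sink.writeByte((byte) (value >>> shift));
            }
        }
    }

    /** Reads individual fields of the toy record back. */
    static final class Reader {
        private final ByteSource source;

        Reader(ByteSource source) {
            this.source = source;
        }

        byte readTag(int recordOffset) {
            return source.readByte(recordOffset);
        }

        int readValue(int recordOffset) {
            int value = 0;
            for (int i = 1; i <= 4; i++) {
                value = (value << 8) | (source.readByte(recordOffset + i) & 0xFF);
            }
            return value;
        }
    }

    /** Serializes one record into memory; no file system involved. */
    static byte[] serialize(int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        new Writer(b -> out.write(b)).writeValue(value);
        return out.toByteArray();
    }

    /** Builds a reader over a plain byte array. */
    static Reader readerOver(byte[] bytes) {
        return new Reader(offset -> bytes[offset]);
    }

    public static void main(String[] args) {
        byte[] bytes = serialize(0x01020304);
        bytes[4] ^= 0x01; // flip one bit in the last byte of the value field
        Reader reader = readerOver(bytes);
        // Checking each field separately shows which one the flip damaged.
        System.out.println("tag intact: " + (reader.readTag(0) == Writer.VALUE_TAG));
        System.out.println("value intact: " + (reader.readValue(0) == 0x01020304));
    }
}
```

Because each field is read and compared on its own, a corrupted byte shows up
as a failure in exactly one field check rather than as a generic mismatch.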
There is still more to do:
* {{SegmentReader}} and {{SegmentWriter}} need to be used as the lower level
abstraction to read and write segments. This should be easy, since most of the
code uses {{RawRecordWriter}} and {{RawRecordReader}} directly, which work
seamlessly on top of {{SegmentReader}} and {{SegmentWriter}}.
* Modify the rest of the production code to always use {{SegmentAccess}} when
possible, in order to hide whether a read-only or a read-write segment is used.
This point requires a little bit of thinking, and I'm not sure at the moment
how to tackle it or how invasive the change will be.
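As a rough illustration of the last point, the sketch below shows one way
calling code could stay unaware of which kind of segment it reads from. The
{{Access}} interface and both implementations are hypothetical and only stand
in for the idea; the real {{SegmentAccess}}, {{SegmentReader}} and
{{SegmentWriter}} may look quite different.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical shapes; not the actual classes from the change under review.
final class SegmentAccessSketch {

    /** Read-side view shared by both kinds of segment. */
    interface Access {
        int size();

        byte readByte(int offset);
    }

    /** A segment parsed from already serialized bytes; it cannot be modified. */
    static final class ReadOnly implements Access {
        private final ByteBuffer data;

        ReadOnly(byte[] bytes) {
            this.data = ByteBuffer.wrap(bytes.clone()).asReadOnlyBuffer();
        }

        public int size() {
            return data.limit();
        }

        public byte readByte(int offset) {
            return data.get(offset);
        }
    }

    /** A segment still being written; readable up to the last written byte. */
    static final class ReadWrite implements Access {
        private final ByteBuffer data;

        ReadWrite(int capacity) {
            this.data = ByteBuffer.allocate(capacity);
        }

        void writeByte(byte b) {
            data.put(b);
        }

        public int size() {
            return data.position();
        }

        public byte readByte(int offset) {
            if (offset >= data.position()) {
                throw new IndexOutOfBoundsException("offset " + offset + " not written yet");
            }
            return data.get(offset);
        }

        /** Freezes the bytes written so far into a read-only segment. */
        ReadOnly flush() {
            return new ReadOnly(Arrays.copyOf(data.array(), data.position()));
        }
    }

    /** Example caller: works on either kind of segment without knowing which. */
    static int sumOfBytes(Access segment) {
        int sum = 0;
        for (int i = 0; i < segment.size(); i++) {
            sum += segment.readByte(i) & 0xFF;
        }
        return sum;
    }
}
```

A caller such as {{sumOfBytes}} takes only the shared view, so switching from a
segment in memory to one read back from disk needs no change on its side.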
[~mduerig], in the meantime, it would be great if you could go through my
changes. I will be happy to clarify anything either here or on GitHub.
> Split Segment in a read-only and a read-write implementations
> -------------------------------------------------------------
>
> Key: OAK-4582
> URL: https://issues.apache.org/jira/browse/OAK-4582
> Project: Jackrabbit Oak
> Issue Type: Technical task
> Components: segment-tar
> Reporter: Francesco Mari
> Assignee: Francesco Mari
> Labels: technical_debt
> Fix For: 1.8
>
> Attachments: benchmark-01.png, benchmark-01.txt
>
>
> {{Segment}} is central to the working of the Segment Store, but it currently
> serves two purposes:
> # It is a temporary storage location for the currently written segment,
> waiting to be full and flushed to disk.
> # It is a way to parse serialized segments read from disk.
> To distinguish these two use cases, I suggest promoting {{Segment}} to the
> status of interface, and creating two different implementations: a read-only
> and a read-write segment.