[
https://issues.apache.org/jira/browse/OAK-6922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tomek Rękawek resolved OAK-6922.
--------------------------------
Resolution: Fixed
Merged to trunk in [r1827292|https://svn.apache.org/r1827292].
> Azure support for the segment-tar
> ---------------------------------
>
> Key: OAK-6922
> URL: https://issues.apache.org/jira/browse/OAK-6922
> Project: Jackrabbit Oak
> Issue Type: New Feature
> Components: segment-tar
> Reporter: Tomek Rękawek
> Assignee: Tomek Rękawek
> Priority: Major
> Fix For: 1.9.0, 1.10
>
> Attachments: OAK-6922-2.patch, OAK-6922-3.patch, OAK-6922.patch
>
>
> An Azure Blob Storage implementation of the segment storage, based on the
> OAK-6921 work.
> h3. Segment files layout
> The new implementation doesn't use tar files. They are replaced with
> directories that store the segments, each named after its UUID. This
> approach has the following advantages:
> * no need to call seek(), which may be expensive on a remote file system.
> Instead, we can read the whole file (=segment) at once.
> * it's possible to send multiple segments at once, asynchronously, which
> reduces the performance overhead (see below).
> The file structure is as follows:
> {noformat}
> [~]$ az storage blob list -c oak --output table
> Name                                                          Blob Type    Blob Tier    Length    Content Type              Last Modified
> ------------------------------------------------------------  -----------  -----------  --------  ------------------------  -------------------------
> oak/data00000a.tar/0000.ca1326d1-edf4-4d53-aef0-0f14a6d05b63  BlockBlob                 192       application/octet-stream  2018-01-31T10:59:14+00:00
> oak/data00000a.tar/0001.c6e03426-db9d-4315-a20a-12559e6aee54  BlockBlob                 262144    application/octet-stream  2018-01-31T10:59:14+00:00
> oak/data00000a.tar/0002.b3784e27-6d16-4f80-afc1-6f3703f6bdb9  BlockBlob                 262144    application/octet-stream  2018-01-31T10:59:14+00:00
> oak/data00000a.tar/0003.5d2f9588-0c92-4547-abf7-0263ee7c37bb  BlockBlob                 259216    application/octet-stream  2018-01-31T10:59:14+00:00
> ...
> oak/data00000a.tar/006e.7b8cf63d-849a-4120-aa7c-47c3dde25e48  BlockBlob                 4368      application/octet-stream  2018-01-31T12:01:09+00:00
> oak/data00000a.tar/006f.93799ae9-288e-4b32-afc2-bbc676fad7e5  BlockBlob                 3792      application/octet-stream  2018-01-31T12:01:14+00:00
> oak/data00000a.tar/0070.8b2d5ff2-6a74-4ac3-a3cc-cc439367c2aa  BlockBlob                 3680      application/octet-stream  2018-01-31T12:01:14+00:00
> oak/data00000a.tar/0071.2a1c49f0-ce33-4777-a042-8aa8a704d202  BlockBlob                 7760      application/octet-stream  2018-01-31T12:10:54+00:00
> oak/journal.log.001                                           AppendBlob                1010      application/octet-stream  2018-01-31T12:10:54+00:00
> oak/manifest                                                  BlockBlob                 46        application/octet-stream  2018-01-31T10:59:14+00:00
> oak/repo.lock                                                 BlockBlob                           application/octet-stream  2018-01-31T10:59:14+00:00
> {noformat}
> For the segment files, each name is prefixed with the index number. This
> preserves the order of the segments, as in the tar archive. The order is
> normally stored in the index file as well, but if that file is missing, the
> recovery process uses the prefixes to restore it.
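
The naming scheme above can be sketched in a few lines of Java. This is only an illustration, not code from the patch: the class and method names are hypothetical, and the four-digit lowercase hexadecimal prefix is an assumption read off the listing (006e, 006f, 0070, 0071).

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.UUID;

// Hypothetical helper, not taken from the OAK-6922 patch.
final class SegmentBlobNames {

    private SegmentBlobNames() {
    }

    // Builds a name such as "006e.7b8cf63d-849a-4120-aa7c-47c3dde25e48".
    // Assumption: the prefix is the index as four lowercase hex digits.
    static String toName(int index, UUID segmentId) {
        return String.format("%04x.%s", index, segmentId);
    }

    // Extracts the position encoded in the hex digits before the first dot.
    static int indexOf(String name) {
        return Integer.parseInt(name.substring(0, name.indexOf('.')), 16);
    }

    // Extracts the segment UUID that follows the first dot.
    static UUID segmentIdOf(String name) {
        return UUID.fromString(name.substring(name.indexOf('.') + 1));
    }

    // Returns a copy of the names sorted by index prefix, i.e. in write order.
    static List<String> inWriteOrder(List<String> names) {
        List<String> sorted = new ArrayList<>(names);
        sorted.sort(Comparator.comparingInt(SegmentBlobNames::indexOf));
        return sorted;
    }
}
```

Sorting by the parsed prefix rather than by the raw string keeps the order correct even if the index ever outgrows four digits.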
> Each file contains the raw segment data, with no padding/headers. Apart from
> the segment files, there are 3 special files: binary references (.brf),
> segment graph (.gph) and segment index (.idx).
> h3. Asynchronous writes
> Normally, all TarWriter writes are synchronous, appending the segments to
> the tar file. With Azure Blob Storage, each write incurs network latency.
> That's why the SegmentWriteQueue was introduced. The segments are added to
> a blocking deque, which is served by a number of consumer threads that
> write the segments to the cloud. There's also a UUID->Segment map, which
> makes it possible to return segments requested by the readSegment() method
> before they are actually persisted. Segments are removed from the map only
> after a successful write operation.
> The flush() method stops accepting new segments and returns once all
> waiting segments are written. The close() method waits until the current
> operations are finished and then stops all threads.
> The asynchronous mode can be disabled by setting the number of threads to 0.
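
A simplified model of the queue described above might look like the following. It is a sketch under assumptions, not the SegmentWriteQueue from the patch: the class name, the method names add() and read(), the BiConsumer standing in for the Azure upload, and the polling intervals are all made up for illustration. Failure handling (the recovery mode) is left out, so the sketch assumes the writer never throws.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.BlockingDeque;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingDeque;
import java.util.concurrent.TimeUnit;
import java.util.function.BiConsumer;

// Hypothetical, simplified model of an asynchronous segment write queue.
final class WriteQueueSketch {

    private final BlockingDeque<UUID> queue = new LinkedBlockingDeque<>();
    // Segments accepted but not yet persisted, so reads can be served early.
    private final Map<UUID, byte[]> pending = new ConcurrentHashMap<>();
    private final BiConsumer<UUID, byte[]> remoteWriter; // stand-in for the upload
    private final Thread[] consumers;
    private final Object acceptLock = new Object();
    private volatile boolean closed;

    WriteQueueSketch(int threads, BiConsumer<UUID, byte[]> remoteWriter) {
        this.remoteWriter = remoteWriter;
        this.consumers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            consumers[i] = new Thread(this::consume, "segment-writer-" + i);
            consumers[i].start();
        }
    }

    // Accepts a segment; with 0 threads the write happens synchronously.
    void add(UUID id, byte[] data) {
        synchronized (acceptLock) {
            if (consumers.length == 0) {
                remoteWriter.accept(id, data);
                return;
            }
            pending.put(id, data);
            queue.addLast(id);
        }
    }

    // Returns a not-yet-persisted segment, or null if it must be read remotely.
    byte[] read(UUID id) {
        return pending.get(id);
    }

    // Holds the accept lock (blocking add()) until all pending writes are done.
    void flush() throws InterruptedException {
        synchronized (acceptLock) {
            while (!pending.isEmpty() && !closed) {
                Thread.sleep(10);
            }
        }
    }

    // Lets in-flight writes finish, then stops the consumer threads.
    void close() throws InterruptedException {
        closed = true;
        for (Thread t : consumers) {
            t.join();
        }
    }

    private void consume() {
        try {
            while (!closed) {
                UUID id = queue.pollFirst(100, TimeUnit.MILLISECONDS);
                if (id == null) {
                    continue;
                }
                remoteWriter.accept(id, pending.get(id));
                pending.remove(id); // only after a successful write
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Keeping the segment in the map until the upload returns is what lets read() serve data that has been accepted but not yet persisted.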
> h5. Queue recovery mode
> If the Azure Blob Storage write() operation fails, the segment is re-added
> to the queue and the queue switches to a "recovery mode". In this mode, all
> the threads are suspended and new segments are not accepted (active waiting).
> A single thread retries adding the segment with some delay. If the segment
> is successfully written, the queue goes back to normal operation.
> This way the unavailable remote service is not flooded with requests and
> we're not accepting segments when we can't persist them.
> The close() method finishes the recovery mode - in this case, some of the
> awaiting segments won't be persisted.
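
The retry loop at the heart of the recovery mode could be modelled roughly as below. This is a hedged sketch, not the patch's implementation: the class, the RemoteWrite interface and all method names are hypothetical, and suspending the other consumer threads is reduced to a flag they would be expected to check.

```java
import java.io.IOException;

// Hypothetical model of the "recovery mode" retry loop.
final class RecoveryModeSketch {

    // Stand-in for a single remote write of one segment.
    interface RemoteWrite {
        void run() throws IOException;
    }

    private volatile boolean recovering; // other writers would pause while true
    private volatile boolean closed;

    boolean inRecoveryMode() {
        return recovering;
    }

    // Ends the recovery mode; a segment still being retried is not persisted.
    void close() {
        closed = true;
    }

    // Retries the failed write with a delay until it succeeds or close() is
    // called. Returns true if the segment was eventually written.
    boolean retryUntilWritten(RemoteWrite write, long delayMillis)
            throws InterruptedException {
        recovering = true;
        while (!closed) {
            try {
                write.run();
                recovering = false; // back to normal operation
                return true;
            } catch (IOException e) {
                Thread.sleep(delayMillis); // don't flood the unavailable service
            }
        }
        return false;
    }
}
```

Only one thread runs this loop, so the remote service sees at most one request per delay interval while it is unavailable.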
> h5. Consistency
> The asynchronous mode isn't as reliable as the standard, synchronous case.
> The following cases are possible:
> * TarWriter#writeEntry() returns successfully, but the segments are not
> persisted.
> * TarWriter#writeEntry() accepts a number of segments: S1, S2, S3. The S2 and
> S3 are persisted, but the S1 is not.
> On the other hand:
> * If TarWriter#flush() returns successfully, it means that all the
> accepted segments have been persisted.
> h5. Recovery
> During the segment recovery (e.g. if the index file is missing), the Azure
> implementation checks whether any segment is missing in the middle. If so,
> only the consecutive segments before the gap are recovered. For instance,
> if we have S1, S2, S3, S5, S6, S7, then the recovery process will return
> only the first three.
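
The rule for gaps can be illustrated with a short Java sketch. It is an assumption-laden example rather than the patch's recovery code: the class and method are hypothetical, the index prefix is parsed as hexadecimal, and the run is assumed to start at the lowest index that is present.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of recovering only the consecutive segments.
final class ConsecutiveRecovery {

    private ConsecutiveRecovery() {
    }

    // Returns the blob names up to (not including) the first gap in the
    // index prefixes. Assumption: the run starts at the lowest index present.
    static List<String> recoverable(List<String> blobNames) {
        TreeMap<Integer, String> byIndex = new TreeMap<>();
        for (String name : blobNames) {
            int index = Integer.parseInt(name.substring(0, name.indexOf('.')), 16);
            byIndex.put(index, name);
        }
        List<String> result = new ArrayList<>();
        int expected = byIndex.isEmpty() ? 0 : byIndex.firstKey();
        for (Map.Entry<Integer, String> entry : byIndex.entrySet()) {
            if (entry.getKey() != expected) {
                break; // gap found: everything after it is dropped
            }
            result.add(entry.getValue());
            expected++;
        }
        return result;
    }
}
```

For names with the prefixes 0001, 0002, 0003, 0005, 0006 and 0007, the method returns only the first three, matching the S1..S7 example above.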
> Patch:
> * [OAK-6922.patch|https://github.com/trekawek/jackrabbit-oak/pull/10.diff]
> * [GitHub diff|https://github.com/trekawek/jackrabbit-oak/pull/10/files]
> * [branch|https://github.com/trekawek/jackrabbit-oak/tree/OAK-6922]
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)