This is an automated email from the ASF dual-hosted git repository.
thomasm pushed a change to branch trunk
in repository https://gitbox.apache.org/repos/asf/jackrabbit-oak.git
from 56366607a4 Merge pull request #995 from mreutegg/OAK-10313
new cbfacdb90f Initial commit of pipelined download strategy.
new c505124a35 Minor fixes
new 03e3be8992 WIP
new 58a36a1bd2 Merge remote-tracking branch 'upstream/trunk' into GRANITE-45911
new f805fdab28 Add recovery from broken MongoDB connections to downloader.
new 9bfb9fc54a Merge remote-tracking branch 'upstream/trunk' into GRANITE-45911
new 45d0eaf291 Shutdown gracefully if one of the dump stages fails with an exception.
new 9d1382ccb0 Add documentation. Change the configuration for the retrial mechanism for MongoDB connection failures: instead of number of retrials, use the amount of time to keep trying before giving up.
new db824bd6dc Merge remote-tracking branch 'upstream/trunk' into OAK-10294
new 832074fe64 Merge remote-tracking branch 'upstream/trunk' into OAK-10294
new 7384304e73 Add support for auto-tuning working set memory based on total available memory to the indexer. Switch to using system properties to configure the pipelined strategy (instead of env variables).
new 981e673e6c Always use read preference secondaryPreferred for downloading from Mongo.
new 0e10939319 Address reviews comments.
new 8633be3e60 Use 0 and Long.MAX_VALUE as boundaries for download range, no need to query Mongo to determine the earliest and latest values of _modified.
new 5edda975aa Merge remote-tracking branch 'upstream/trunk' into OAK-10294
new c0b80afe51 Add a new stage to merge sorted files. Log how much time the download and transform threads spend waiting to enqueue their outputs in the out queues (this indicates that the stages after them are too slow). Collect and log metrics describing the download and transform stages, like mongo documents downloaded, node state entries extracted, filtered, mongo documents that do not match a node state (garbage) and a few others. Use only FileUtils.byteCountToDisplaySize() to p [...]
new ea7ce8a70f Merge remote-tracking branch 'upstream/trunk' into OAK-10294
new 2b84a30d44 Improve collection and logging of metrics.
new c68fad0dcc Add unit test for merge-sort task.
new 93f9eacd85 Merge remote-tracking branch 'upstream/trunk' into OAK-10294
new 8c5dc547e8 Add more tests, including a first draft of an integration test. Refactor code.
new 787a4d575c Fix
new 62697ca44e Merge remote-tracking branch 'upstream/trunk' into OAK-10294
new 66d176bb90 Add an integration test
new 6387d4dc5b Add license header.
new a9dd47546f Add license header.
new ab27625a83 Rename test files.
new fabea754fd Do not filter on _modified != null when doing non-recoverable download.
new 1cf44cf70c Merge remote-tracking branch 'upstream/trunk' into OAK-10294
new a7f48a0370 Fixes based on review comments.
new 5590997ded Bound the size of the histograms.
new ab06697edb Add missing license files.
new 2839cc9d45 Improve logging and code clean of bounded histogram.
new ef2182e4aa Display human readable byte counts with 2 decimal places instead of abbreviating to closest integer value.
new 27f9b710ef Improved logging.
new 4f92e897d9 Refactoring
new 4c33e932b4 Remove changes unrelated to PR.
new ec2b12fd8e Merge remote-tracking branch 'upstream/trunk' into OAK-10294
new a4b7bf1016 Add tests for error handling when configuration properties have invalid values. Minor refactoring.
new 11f308c86f Add test for when the path predicate does not match any existing path in the document store. Address other review comments.
new 81eaeead4a Update oak-run-commons/src/main/java/org/apache/jackrabbit/oak/index/indexer/document/flatfile/pipelined/PipelinedMergeSortTask.java
new 560dd56d5d Update oak-run-commons/src/main/java/org/apache/jackrabbit/oak/index/indexer/document/flatfile/pipelined/PipelinedMongoDownloadTask.java
new 00923a38c0 Merge remote-tracking branch 'origin/OAK-10294' into OAK-10294
new 074e4af804 Add unit test to test recovery from broken mongo connections. Fix: when a connection to Mongo is lost, the documents that were collected in a block but not yet enqueued were being lost.
new 70e6b89a86 Add missing license.
new 25c01b8176 Merge pull request #979 from nfsantos/OAK-10294
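
Commit 9d1382ccb0 above switches the MongoDB retry configuration from a retry count to a time budget. The following is only a minimal sketch of that general idea, not the code from this change: the class name RetryUntilDeadline, the method callWithRetry and its parameters are hypothetical, and the budget is measured here from the start of the call, which may differ from how the actual downloader measures it.

```java
import java.time.Duration;
import java.util.function.Supplier;

// Hypothetical illustration: retry an operation until a time budget is used up,
// instead of giving up after a fixed number of attempts.
public class RetryUntilDeadline {

    public static <T> T callWithRetry(Supplier<T> operation,
                                      Duration maxRetryTime,
                                      Duration retryInterval) {
        // Assumption: the budget starts when this method is called.
        long deadline = System.nanoTime() + maxRetryTime.toNanos();
        while (true) {
            try {
                return operation.get();
            } catch (RuntimeException e) {
                // Give up and rethrow once the time budget is exhausted.
                if (System.nanoTime() - deadline >= 0) {
                    throw e;
                }
                try {
                    Thread.sleep(retryInterval.toMillis());
                } catch (InterruptedException ie) {
                    // Preserve the interrupt flag and stop retrying.
                    Thread.currentThread().interrupt();
                    throw e;
                }
            }
        }
    }

    public static void main(String[] args) {
        int[] attempts = {0};
        String result = callWithRetry(() -> {
            // Simulate two failed connection attempts before success.
            if (++attempts[0] < 3) {
                throw new IllegalStateException("simulated connection failure");
            }
            return "ok";
        }, Duration.ofSeconds(5), Duration.ofMillis(10));
        System.out.println(result + " after " + attempts[0] + " attempts");
    }
}
```

A time budget tends to be easier to reason about operationally than a retry count, because the total wait no longer depends on how long each failed attempt takes.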
The 18643 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails. The revisions
listed as "add" were already present in the repository and have only
been added to this reference.
Summary of changes:
.../org/apache/jackrabbit/oak/commons/IOUtils.java | 21 +
.../jackrabbit/oak/commons/sort/ExternalSort.java | 4 +-
oak-run-commons/pom.xml | 7 +
.../indexer/document/DocumentStoreIndexerBase.java | 63 +--
.../indexer/document/NodeStateEntryTraverser.java | 7 +-
.../document/NodeStateEntryTraverserFactory.java | 5 +-
.../flatfile/FlatFileNodeStoreBuilder.java | 39 +-
.../document/flatfile/FlatFileStoreUtils.java | 8 +-
.../MultithreadedTraverseWithSortStrategy.java | 10 +-
.../document/flatfile/NodeStateEntryWriter.java | 26 +-
.../document/flatfile/StoreAndSortStrategy.java | 11 +-
.../document/flatfile/TraverseAndSortTask.java | 10 +-
.../flatfile/TraverseWithSortStrategy.java | 19 +-
.../flatfile/pipelined/BoundedHistogram.java | 89 +++++
.../document/flatfile/pipelined/ConfigHelper.java} | 44 +--
.../document/flatfile/pipelined/DownloadRange.java | 74 ++++
.../flatfile/pipelined/NodeStateEntryBatch.java | 96 +++++
.../NodeStateHolder.java} | 39 +-
.../{ => pipelined}/PathElementComparator.java | 32 +-
.../flatfile/pipelined/PipelinedMergeSortTask.java | 140 +++++++
.../pipelined/PipelinedMongoDownloadTask.java | 327 ++++++++++++++++
.../flatfile/pipelined/PipelinedSortBatchTask.java | 155 ++++++++
.../flatfile/pipelined/PipelinedStrategy.java | 433 +++++++++++++++++++++
.../flatfile/pipelined/PipelinedTransformTask.java | 246 ++++++++++++
.../document/flatfile/pipelined/SortKey.java | 89 +++++
.../pipelined/TransformStageStatistics.java | 171 ++++++++
.../document/mongo/MongoDocumentStoreHelper.java | 9 +
.../document/mongo/MongoDocumentTraverser.java | 65 +---
.../plugins/document/mongo/TraversingRange.java | 81 ++++
.../document/flatfile/FlatFileStoreTest.java | 8 +-
.../MultithreadedTraverseWithSortStrategyTest.java | 4 +-
.../document/flatfile/TraverseAndSortTaskTest.java | 2 +-
.../flatfile/pipelined/BoundedHistogramTest.java | 56 +++
.../pipelined/NodeStateEntryBatchTest.java | 88 +++++
.../document/flatfile/pipelined/PipelinedIT.java | 213 ++++++++++
.../pipelined/PipelinedMergeSortTaskTest.java | 120 ++++++
.../pipelined/PipelinedMongoDownloadTaskTest.java | 104 +++++
.../pipelined/PipelinedSortBatchTaskTest.java | 186 +++++++++
.../test/resources/pipelined/merge-expected.json | 6 +
.../test/resources/pipelined/merge-stage-1.json | 3 +
.../test/resources/pipelined/merge-stage-2.json | 3 +
.../document/mongo/DocumentTraverserTest.java | 2 +-
42 files changed, 2903 insertions(+), 212 deletions(-)
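
The summary lists a new BoundedHistogram.java, and commit 5590997ded is described as "Bound the size of the histograms." As a rough illustration of what bounding a histogram can mean, here is a sketch under stated assumptions: the class BoundedHistogramSketch and its methods are hypothetical, and the overflow policy (ignore keys not yet tracked once the limit is reached) is an assumption, not necessarily what the committed class does.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of a histogram whose number of distinct keys is capped,
// so that memory use stays bounded no matter how many different keys are seen.
public class BoundedHistogramSketch {
    private final Map<String, Long> counts = new HashMap<>();
    private final int maxEntries;
    private boolean overflowed = false;

    public BoundedHistogramSketch(int maxEntries) {
        this.maxEntries = maxEntries;
    }

    public void addEntry(String key) {
        Long current = counts.get(key);
        if (current != null) {
            // Keys that are already tracked can always be incremented.
            counts.put(key, current + 1);
        } else if (counts.size() < maxEntries) {
            counts.put(key, 1L);
        } else {
            // Assumed policy: once full, new keys are dropped and the overflow is recorded.
            overflowed = true;
        }
    }

    public boolean isOverflowed() {
        return overflowed;
    }

    public Map<String, Long> snapshot() {
        return Collections.unmodifiableMap(new HashMap<>(counts));
    }

    public static void main(String[] args) {
        BoundedHistogramSketch histogram = new BoundedHistogramSketch(2);
        for (String key : new String[] {"/a", "/b", "/a", "/c"}) {
            histogram.addEntry(key);
        }
        System.out.println(histogram.snapshot() + " overflowed=" + histogram.isOverflowed());
    }
}
```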
create mode 100644 oak-run-commons/src/main/java/org/apache/jackrabbit/oak/index/indexer/document/flatfile/pipelined/BoundedHistogram.java
copy oak-run-commons/src/{test/java/org/apache/jackrabbit/oak/index/indexer/document/flatfile/CountingIterable.java => main/java/org/apache/jackrabbit/oak/index/indexer/document/flatfile/pipelined/ConfigHelper.java} (53%)
create mode 100644 oak-run-commons/src/main/java/org/apache/jackrabbit/oak/index/indexer/document/flatfile/pipelined/DownloadRange.java
create mode 100644 oak-run-commons/src/main/java/org/apache/jackrabbit/oak/index/indexer/document/flatfile/pipelined/NodeStateEntryBatch.java
copy oak-run-commons/src/main/java/org/apache/jackrabbit/oak/index/indexer/document/flatfile/{SimpleNodeStateHolder.java => pipelined/NodeStateHolder.java} (55%)
copy oak-run-commons/src/main/java/org/apache/jackrabbit/oak/index/indexer/document/flatfile/{ => pipelined}/PathElementComparator.java (73%)
create mode 100644 oak-run-commons/src/main/java/org/apache/jackrabbit/oak/index/indexer/document/flatfile/pipelined/PipelinedMergeSortTask.java
create mode 100644 oak-run-commons/src/main/java/org/apache/jackrabbit/oak/index/indexer/document/flatfile/pipelined/PipelinedMongoDownloadTask.java
create mode 100644 oak-run-commons/src/main/java/org/apache/jackrabbit/oak/index/indexer/document/flatfile/pipelined/PipelinedSortBatchTask.java
create mode 100644 oak-run-commons/src/main/java/org/apache/jackrabbit/oak/index/indexer/document/flatfile/pipelined/PipelinedStrategy.java
create mode 100644 oak-run-commons/src/main/java/org/apache/jackrabbit/oak/index/indexer/document/flatfile/pipelined/PipelinedTransformTask.java
create mode 100644 oak-run-commons/src/main/java/org/apache/jackrabbit/oak/index/indexer/document/flatfile/pipelined/SortKey.java
create mode 100644 oak-run-commons/src/main/java/org/apache/jackrabbit/oak/index/indexer/document/flatfile/pipelined/TransformStageStatistics.java
create mode 100644 oak-run-commons/src/main/java/org/apache/jackrabbit/oak/plugins/document/mongo/TraversingRange.java
create mode 100644 oak-run-commons/src/test/java/org/apache/jackrabbit/oak/index/indexer/document/flatfile/pipelined/BoundedHistogramTest.java
create mode 100644 oak-run-commons/src/test/java/org/apache/jackrabbit/oak/index/indexer/document/flatfile/pipelined/NodeStateEntryBatchTest.java
create mode 100644 oak-run-commons/src/test/java/org/apache/jackrabbit/oak/index/indexer/document/flatfile/pipelined/PipelinedIT.java
create mode 100644 oak-run-commons/src/test/java/org/apache/jackrabbit/oak/index/indexer/document/flatfile/pipelined/PipelinedMergeSortTaskTest.java
create mode 100644 oak-run-commons/src/test/java/org/apache/jackrabbit/oak/index/indexer/document/flatfile/pipelined/PipelinedMongoDownloadTaskTest.java
create mode 100644 oak-run-commons/src/test/java/org/apache/jackrabbit/oak/index/indexer/document/flatfile/pipelined/PipelinedSortBatchTaskTest.java
create mode 100644 oak-run-commons/src/test/resources/pipelined/merge-expected.json
create mode 100644 oak-run-commons/src/test/resources/pipelined/merge-stage-1.json
create mode 100644 oak-run-commons/src/test/resources/pipelined/merge-stage-2.json
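
The merge-stage test resources and PipelinedMergeSortTask.java relate to the "new stage to merge sorted files" mentioned in commit c0b80afe51. For readers unfamiliar with the technique, a k-way merge of already-sorted runs might look roughly like the sketch below. It is not the committed implementation: the class KWayMergeSketch is hypothetical, in-memory lists stand in for sorted files, and the comparator passed by the caller is a stand-in for whatever ordering the real task applies.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration of a k-way merge: each input run is already sorted,
// and a priority queue always yields the smallest remaining head element.
public class KWayMergeSketch {

    // The current head of one run together with the iterator it came from.
    private static final class Head {
        final String value;
        final Iterator<String> source;

        Head(String value, Iterator<String> source) {
            this.value = value;
            this.source = source;
        }
    }

    public static List<String> merge(List<List<String>> sortedRuns, Comparator<String> order) {
        PriorityQueue<Head> heads =
                new PriorityQueue<>((a, b) -> order.compare(a.value, b.value));
        // Seed the queue with the first element of every non-empty run.
        for (List<String> run : sortedRuns) {
            Iterator<String> it = run.iterator();
            if (it.hasNext()) {
                heads.add(new Head(it.next(), it));
            }
        }
        List<String> merged = new ArrayList<>();
        while (!heads.isEmpty()) {
            Head smallest = heads.poll();
            merged.add(smallest.value);
            // Refill the queue from the run that just supplied an element.
            if (smallest.source.hasNext()) {
                heads.add(new Head(smallest.source.next(), smallest.source));
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<List<String>> runs = Arrays.asList(
                Arrays.asList("/a", "/c", "/e"),
                Arrays.asList("/b", "/d"));
        System.out.println(merge(runs, Comparator.naturalOrder()));
    }
}
```

With k runs and n entries in total, this approach performs on the order of n log k comparisons and only needs to hold one head element per run in memory, which is why it suits merging sorted files that are too large to load at once.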