This is an automated email from the ASF dual-hosted git repository.

thomasm pushed a change to branch trunk
in repository https://gitbox.apache.org/repos/asf/jackrabbit-oak.git


    from 56366607a4 Merge pull request #995 from mreutegg/OAK-10313
     new cbfacdb90f Initial commit of pipelined download strategy.
     new c505124a35 Minor fixes
     new 03e3be8992 WIP
     new 58a36a1bd2 Merge remote-tracking branch 'upstream/trunk' into GRANITE-45911
     new f805fdab28 Add recovery from broken MongoDB connections to downloader.
     new 9bfb9fc54a Merge remote-tracking branch 'upstream/trunk' into GRANITE-45911
     new 45d0eaf291 Shutdown gracefully if one of the dump stages fails with an exception.
     new 9d1382ccb0 Add documentation. Change the configuration for the retrial mechanism for MongoDB connection failures: instead of number of retrials, use the amount of time to keep trying before giving up.
     new db824bd6dc Merge remote-tracking branch 'upstream/trunk' into OAK-10294
     new 832074fe64 Merge remote-tracking branch 'upstream/trunk' into OAK-10294
     new 7384304e73 Add support for auto-tuning working set memory based on total available memory to the indexer. Switch to using system properties to configure the pipelined strategy (instead of env variables).
     new 981e673e6c Always use read preference secondaryPreferred for downloading from Mongo.
     new 0e10939319 Address reviews comments.
     new 8633be3e60 Use 0 and Long.MAX_VALUE as boundaries for download range, no need to query Mongo to determine the earliest and latest values of _modified.
     new 5edda975aa Merge remote-tracking branch 'upstream/trunk' into OAK-10294
     new c0b80afe51 Add a new stage to merge sorted files. Log how much time the download and transform threads spend waiting to enqueue their outputs in the out queues (this indicates that the stages after them are too slow). Collect and log metrics describing the download and transform stages, like mongo documents downloaded, node state entries extracted, filtered, mongo documents that do not match a node state (garbage) and a few others. Use only FileUtils.byteCountToDisplaySize() to p [...]
     new ea7ce8a70f Merge remote-tracking branch 'upstream/trunk' into OAK-10294
     new 2b84a30d44 Improve collection and logging of metrics.
     new c68fad0dcc Add unit test for merge-sort task.
     new 93f9eacd85 Merge remote-tracking branch 'upstream/trunk' into OAK-10294
     new 8c5dc547e8 Add more tests, including a first draft of an integration test. Refactor code.
     new 787a4d575c Fix
     new 62697ca44e Merge remote-tracking branch 'upstream/trunk' into OAK-10294
     new 66d176bb90 Add an integration test
     new 6387d4dc5b Add license header.
     new a9dd47546f Add license header.
     new ab27625a83 Rename test files.
     new fabea754fd Do not filter on _modified != null when doing non-recoverable download.
     new 1cf44cf70c Merge remote-tracking branch 'upstream/trunk' into OAK-10294
     new a7f48a0370 Fixes based on review comments.
     new 5590997ded Bound the size of the histograms.
     new ab06697edb Add missing license files.
     new 2839cc9d45 Improve logging and code clean of bounded histogram.
     new ef2182e4aa Display human readable byte counts with 2 decimal places instead of abbreviating to closest integer value.
     new 27f9b710ef Improved logging.
     new 4f92e897d9 Refactoring
     new 4c33e932b4 Remove changes unrelated to PR.
     new ec2b12fd8e Merge remote-tracking branch 'upstream/trunk' into OAK-10294
     new a4b7bf1016 Add tests for error handling when configuration properties have invalid values. Minor refactoring.
     new 11f308c86f Add test for when the path predicate does not match any existing path in the document store. Address other review comments.
     new 81eaeead4a Update oak-run-commons/src/main/java/org/apache/jackrabbit/oak/index/indexer/document/flatfile/pipelined/PipelinedMergeSortTask.java
     new 560dd56d5d Update oak-run-commons/src/main/java/org/apache/jackrabbit/oak/index/indexer/document/flatfile/pipelined/PipelinedMongoDownloadTask.java
     new 00923a38c0 Merge remote-tracking branch 'origin/OAK-10294' into OAK-10294
     new 074e4af804 Add unit test to test recovery from broken mongo connections. Fix: when a connection to Mongo is lost, the documents that were collected in a block but not yet enqueued were being lost.
     new 70e6b89a86 Add missing license.
     new 25c01b8176 Merge pull request #979 from nfsantos/OAK-10294

The 18643 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .../org/apache/jackrabbit/oak/commons/IOUtils.java |  21 +
 .../jackrabbit/oak/commons/sort/ExternalSort.java  |   4 +-
 oak-run-commons/pom.xml                            |   7 +
 .../indexer/document/DocumentStoreIndexerBase.java |  63 +--
 .../indexer/document/NodeStateEntryTraverser.java  |   7 +-
 .../document/NodeStateEntryTraverserFactory.java   |   5 +-
 .../flatfile/FlatFileNodeStoreBuilder.java         |  39 +-
 .../document/flatfile/FlatFileStoreUtils.java      |   8 +-
 .../MultithreadedTraverseWithSortStrategy.java     |  10 +-
 .../document/flatfile/NodeStateEntryWriter.java    |  26 +-
 .../document/flatfile/StoreAndSortStrategy.java    |  11 +-
 .../document/flatfile/TraverseAndSortTask.java     |  10 +-
 .../flatfile/TraverseWithSortStrategy.java         |  19 +-
 .../flatfile/pipelined/BoundedHistogram.java       |  89 +++++
 .../document/flatfile/pipelined/ConfigHelper.java} |  44 +--
 .../document/flatfile/pipelined/DownloadRange.java |  74 ++++
 .../flatfile/pipelined/NodeStateEntryBatch.java    |  96 +++++
 .../NodeStateHolder.java}                          |  39 +-
 .../{ => pipelined}/PathElementComparator.java     |  32 +-
 .../flatfile/pipelined/PipelinedMergeSortTask.java | 140 +++++++
 .../pipelined/PipelinedMongoDownloadTask.java      | 327 ++++++++++++++++
 .../flatfile/pipelined/PipelinedSortBatchTask.java | 155 ++++++++
 .../flatfile/pipelined/PipelinedStrategy.java      | 433 +++++++++++++++++++++
 .../flatfile/pipelined/PipelinedTransformTask.java | 246 ++++++++++++
 .../document/flatfile/pipelined/SortKey.java       |  89 +++++
 .../pipelined/TransformStageStatistics.java        | 171 ++++++++
 .../document/mongo/MongoDocumentStoreHelper.java   |   9 +
 .../document/mongo/MongoDocumentTraverser.java     |  65 +---
 .../plugins/document/mongo/TraversingRange.java    |  81 ++++
 .../document/flatfile/FlatFileStoreTest.java       |   8 +-
 .../MultithreadedTraverseWithSortStrategyTest.java |   4 +-
 .../document/flatfile/TraverseAndSortTaskTest.java |   2 +-
 .../flatfile/pipelined/BoundedHistogramTest.java   |  56 +++
 .../pipelined/NodeStateEntryBatchTest.java         |  88 +++++
 .../document/flatfile/pipelined/PipelinedIT.java   | 213 ++++++++++
 .../pipelined/PipelinedMergeSortTaskTest.java      | 120 ++++++
 .../pipelined/PipelinedMongoDownloadTaskTest.java  | 104 +++++
 .../pipelined/PipelinedSortBatchTaskTest.java      | 186 +++++++++
 .../test/resources/pipelined/merge-expected.json   |   6 +
 .../test/resources/pipelined/merge-stage-1.json    |   3 +
 .../test/resources/pipelined/merge-stage-2.json    |   3 +
 .../document/mongo/DocumentTraverserTest.java      |   2 +-
 42 files changed, 2903 insertions(+), 212 deletions(-)
 create mode 100644 oak-run-commons/src/main/java/org/apache/jackrabbit/oak/index/indexer/document/flatfile/pipelined/BoundedHistogram.java
 copy oak-run-commons/src/{test/java/org/apache/jackrabbit/oak/index/indexer/document/flatfile/CountingIterable.java => main/java/org/apache/jackrabbit/oak/index/indexer/document/flatfile/pipelined/ConfigHelper.java} (53%)
 create mode 100644 oak-run-commons/src/main/java/org/apache/jackrabbit/oak/index/indexer/document/flatfile/pipelined/DownloadRange.java
 create mode 100644 oak-run-commons/src/main/java/org/apache/jackrabbit/oak/index/indexer/document/flatfile/pipelined/NodeStateEntryBatch.java
 copy oak-run-commons/src/main/java/org/apache/jackrabbit/oak/index/indexer/document/flatfile/{SimpleNodeStateHolder.java => pipelined/NodeStateHolder.java} (55%)
 copy oak-run-commons/src/main/java/org/apache/jackrabbit/oak/index/indexer/document/flatfile/{ => pipelined}/PathElementComparator.java (73%)
 create mode 100644 oak-run-commons/src/main/java/org/apache/jackrabbit/oak/index/indexer/document/flatfile/pipelined/PipelinedMergeSortTask.java
 create mode 100644 oak-run-commons/src/main/java/org/apache/jackrabbit/oak/index/indexer/document/flatfile/pipelined/PipelinedMongoDownloadTask.java
 create mode 100644 oak-run-commons/src/main/java/org/apache/jackrabbit/oak/index/indexer/document/flatfile/pipelined/PipelinedSortBatchTask.java
 create mode 100644 oak-run-commons/src/main/java/org/apache/jackrabbit/oak/index/indexer/document/flatfile/pipelined/PipelinedStrategy.java
 create mode 100644 oak-run-commons/src/main/java/org/apache/jackrabbit/oak/index/indexer/document/flatfile/pipelined/PipelinedTransformTask.java
 create mode 100644 oak-run-commons/src/main/java/org/apache/jackrabbit/oak/index/indexer/document/flatfile/pipelined/SortKey.java
 create mode 100644 oak-run-commons/src/main/java/org/apache/jackrabbit/oak/index/indexer/document/flatfile/pipelined/TransformStageStatistics.java
 create mode 100644 oak-run-commons/src/main/java/org/apache/jackrabbit/oak/plugins/document/mongo/TraversingRange.java
 create mode 100644 oak-run-commons/src/test/java/org/apache/jackrabbit/oak/index/indexer/document/flatfile/pipelined/BoundedHistogramTest.java
 create mode 100644 oak-run-commons/src/test/java/org/apache/jackrabbit/oak/index/indexer/document/flatfile/pipelined/NodeStateEntryBatchTest.java
 create mode 100644 oak-run-commons/src/test/java/org/apache/jackrabbit/oak/index/indexer/document/flatfile/pipelined/PipelinedIT.java
 create mode 100644 oak-run-commons/src/test/java/org/apache/jackrabbit/oak/index/indexer/document/flatfile/pipelined/PipelinedMergeSortTaskTest.java
 create mode 100644 oak-run-commons/src/test/java/org/apache/jackrabbit/oak/index/indexer/document/flatfile/pipelined/PipelinedMongoDownloadTaskTest.java
 create mode 100644 oak-run-commons/src/test/java/org/apache/jackrabbit/oak/index/indexer/document/flatfile/pipelined/PipelinedSortBatchTaskTest.java
 create mode 100644 oak-run-commons/src/test/resources/pipelined/merge-expected.json
 create mode 100644 oak-run-commons/src/test/resources/pipelined/merge-stage-1.json
 create mode 100644 oak-run-commons/src/test/resources/pipelined/merge-stage-2.json
