[
https://issues.apache.org/jira/browse/LUCENE-7583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15725830#comment-15725830
]
Michael McCandless commented on LUCENE-7583:
--------------------------------------------
bq. Are we sure that we do not open the IndexOutput in one thread and hand it
over to another one?
Yeah, the {{IndexOutput}} is opened in {{Lucene60PointsWriter}}, and then that
same thread goes and writes all points via {{writeField}}. At IW flush time
it's an indexing thread, and at merge time it's a merge thread, but it should
only ever be a single thread touching that {{IndexOutput}}. The benchmark I'm
running only ever uses a single thread anyway ...
bq. we should also make all references to the IndexOutput private, so it cannot
escape the current thread (to help hotspot). This means: no non-private fields
holding the reference to the stream.
I'll try to do this; there's at least one place where it's protected, but
that's way high up in the stack ({{Lucene60PointsWriter}}).
bq. If we are really required to fork the buffered stream, we may use:
https://github.com/apache/lucene-solr/blob/master/solr/solrj/src/java/org/apache/solr/common/util/FastOutputStream.java
(but without the DataOutput interface impl).
I'll test that too.
Thanks [~thetaphi].
> Can we improve OutputStreamIndexOutput's byte buffering when writing each BKD
> leaf block?
> -----------------------------------------------------------------------------------------
>
> Key: LUCENE-7583
> URL: https://issues.apache.org/jira/browse/LUCENE-7583
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Michael McCandless
> Fix For: master (7.0), 6.4
>
> Attachments: LUCENE-7583-hardcode-writeVInt.patch, LUCENE-7583.patch
>
>
> When BKD writes its leaf blocks, it's essentially a lot of tiny writes (vint,
> int, short, etc.), and I've seen deep thread stacks through our IndexOutput
> impl ({{OutputStreamIndexOutput}}) when pulling hot threads while BKD is
> writing.
> So I tried a small change, to have BKDWriter do its own buffering, by first
> writing each leaf block into a {{RAMOutputStream}}, and then dumping that (in
> 1 KB byte[] chunks) to the actual IndexOutput.
> This gives a non-trivial reduction (~6%) in the total BKD writing + merging
> time on the 20M NYC taxis nightly benchmark (2 runs each):
> Trunk, sparse:
> - total: 64.691 sec
> - total: 64.702 sec
> Patch, sparse:
> - total: 60.820 sec
> - total: 60.965 sec
> Trunk, dense:
> - total: 62.730 sec
> - total: 62.383 sec
> Patch, dense:
> - total: 58.805 sec
> - total: 58.742 sec
> The results seem to be consistent and reproducible. I'm using Java 1.8.0_101
> on a fast SSD on Ubuntu 16.04.
> It's sort of weird and annoying that this helps so much, because
> {{OutputStreamIndexOutput}} already uses Java's {{BufferedOutputStream}}
> (default 8 KB buffer) to buffer writes.
> [~thetaphi] suggested that maybe HotSpot is failing to inline/optimize
> {{writeByte}}, or that the call stack just has too many layers.
> We could commit this patch (it's trivial) but it'd be nice to understand and
> fix why buffering writes is somehow costly so any other Lucene codec
> components that write lots of little things can be improved too.
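For readers following along, the per-leaf-block buffering described in the issue can be sketched roughly as below. This is not the attached patch: it uses only {{java.io}}, with a {{ByteArrayOutputStream}} standing in for {{RAMOutputStream}} and a plain {{OutputStream}} standing in for the real {{IndexOutput}}. The class name {{LeafBlockBufferSketch}}, the method {{finishLeafBlock}}, and the big-endian {{writeInt}} are assumptions made for illustration only.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

/** Hypothetical sketch of per-leaf-block buffering; not the Lucene patch. */
final class LeafBlockBufferSketch {
  /** Chunk size used when copying a finished block to the real output. */
  private static final int CHUNK_SIZE = 1024;

  /** Stand-in for RAMOutputStream: collects one leaf block in memory. */
  private final ByteArrayOutputStream scratch = new ByteArrayOutputStream();

  /** Stand-in for the real IndexOutput; kept private so it cannot escape. */
  private final OutputStream out;

  LeafBlockBufferSketch(OutputStream out) {
    this.out = out;
  }

  /** Tiny write: a single byte goes to the in-memory block only. */
  void writeByte(int b) {
    scratch.write(b);
  }

  /** Tiny write: variable-length int, 7 bits per byte, low bits first. */
  void writeVInt(int i) {
    while ((i & ~0x7F) != 0) {
      scratch.write((i & 0x7F) | 0x80); // high bit set: more bytes follow
      i >>>= 7;
    }
    scratch.write(i);
  }

  /** Tiny write: fixed 4-byte int (big-endian is an assumption here). */
  void writeInt(int i) {
    scratch.write(i >>> 24);
    scratch.write(i >>> 16);
    scratch.write(i >>> 8);
    scratch.write(i);
  }

  /** Copies the buffered block to the real output in 1 KB chunks, then resets. */
  void finishLeafBlock() throws IOException {
    byte[] block = scratch.toByteArray();
    for (int off = 0; off < block.length; off += CHUNK_SIZE) {
      out.write(block, off, Math.min(CHUNK_SIZE, block.length - off));
    }
    scratch.reset();
  }
}
```

The point of the sketch is only that the many small writes touch the in-memory buffer, while the underlying stream sees a few bulk {{write(byte[], int, int)}} calls per leaf block. Whether that accounts for the measured difference is exactly the open question in this issue.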