Re: Low untunable default FastWriter output buffer - possible reason for slow single threaded data receiving from Solr on 1Gigabit+ networks while scroll, search etc

Fikavec F Mon, 20 Mar 2023 15:04:06 -0700

I tested streaming expressions ('expr=search(test_collection,q="*:*",fl="id, text_sn",sort="id asc",rows=16000000)') on a collection with one shard of small documents. The server spends a long time preparing the response before the data transfer begins (much as it did when the collection consisted of 8 shards), but then iterates at 606 000 documents per second. Unlike the other methods, all processor cores are involved in the transfer, and all of them become 100% loaded. However, once the response preparation time is taken into account, it turns out to be slower than the /select handler (3m11.374s vs 2m15.507s). It is also not possible to receive all 40000000 small documents in one request, because at some point the transmission is always interrupted.

In solr.log:
o.a.s.s.HttpSolrCall Unable to write response, client closed closed connection or we are shutting down => org.eclipse.jetty.io.EofException: Closed
at org.eclipse.jetty.server.HttpOutput.checkWritable(HttpOutput.java:771)
org.eclipse.jetty.io.EofException: Closed
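
To make the timing comparison above concrete, here is a minimal Python sketch that converts the two reported wall-clock times to seconds and compares them. The last part rests on an assumption that is not stated in the message: that the 3m11.374s run returned the 16000000 rows requested in the expression. If that holds, the reported peak rate implies that most of the elapsed time was spent before streaming started.

```python
import re


def parse_time(text: str) -> float:
    """Convert a `time`-style duration such as '3m11.374s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text)
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


# Wall-clock times reported in the message.
stream_s = parse_time("3m11.374s")   # streaming expression
select_s = parse_time("2m15.507s")   # /select handler

extra_s = stream_s - select_s        # how much longer streaming took
ratio = stream_s / select_s          # relative slowdown

# ASSUMPTION (not stated in the message): the streaming run returned
# the 16,000,000 rows requested by rows=16000000.
ASSUMED_ROWS = 16_000_000
PEAK_RATE = 606_000                  # documents per second, as reported

transfer_s = ASSUMED_ROWS / PEAK_RATE    # time spent streaming at peak rate
prep_s = stream_s - transfer_s           # remainder: response preparation

print(f"streaming: {stream_s:.3f}s, /select: {select_s:.3f}s")
print(f"difference: {extra_s:.3f}s, ratio: {ratio:.2f}x")
print(f"estimated transfer: {transfer_s:.1f}s, estimated preparation: {prep_s:.1f}s")
```

Under that assumption the transfer itself would account for well under a minute, so the preparation phase, not the iteration rate, would dominate the total.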

I also tested the /export handler on a collection with one shard of small documents, replacing the field definition with <field name="text_sn" type="string" indexed="false" docValues="true" multiValued="false" stored="false" />. Its document iteration rate is only 31 700 documents per second (extremely slow).
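
For anyone who wants to repeat the /export test, the request can be built roughly as follows. This is only a sketch: the base URL http://localhost:8983/solr and the helper name export_url are assumptions for illustration, and it presumes that every field in fl and sort (including id) has docValues enabled, which /export requires.

```python
from urllib.parse import urlencode

# ASSUMPTION: Solr is reachable at this address; adjust to your setup.
SOLR_BASE = "http://localhost:8983/solr"


def export_url(collection: str, query: str, fields: list[str], sort: str) -> str:
    """Build an /export request URL (hypothetical helper, illustration only).

    /export needs both `fl` and `sort`, and only works on docValues fields.
    """
    params = urlencode({"q": query, "fl": ",".join(fields), "sort": sort})
    return f"{SOLR_BASE}/{collection}/export?{params}"


url = export_url("test_collection", "*:*", ["id", "text_sn"], "id asc")
print(url)
```

The printed URL can then be fetched with curl under `time` to measure the iteration rate in the same way as the other handlers.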

I can't figure out what the error in my codecFactory is; the log and code are posted here: https://github.com/Fikavec/LuceneCodecWithNoFieldCompression/blob/main/log/solr.log

If I could get my codecFactory to load, I think I could build a codec without compression based on the Zstandard codec code (https://github.com/apache/lucene/pull/439/files) and test it on big and small stored fields, especially since it may be faster than SimpleTextCodecFactory.

Best Regards,
