The jetty.xml is the one from the default Solr configuration. The file is downloaded over HTTP, and nginx may be faster because /etc/nginx/sites-available/default has - location / { aio threads; sendfile on; }
I reproduced this on Ubuntu in virtual machines on my home PC:
# 0 set /etc/sysctl.conf settings from https://darksideclouds.wordpress.com/2016/10/10/tuning-10gb-nics-highway-to-hell/
# and for network adapter ens160
sudo ip link set ens160 txqueuelen 10000
sudo ip link set ens160 mtu 9000
 
# 1 create 16 GB RAM disk
sudo mkdir -p /mnt/solrramdisk
sudo mount -t tmpfs -o size=16G tmpfs /mnt/solrramdisk
 
# 2 download and start solr 
cd /mnt/solrramdisk
wget -O solr-8.11.2.tgz https://www.apache.org/dyn/closer.lua/lucene/solr/8.11.2/solr-8.11.2.tgz?action=
tar xzf solr-8.11.2.tgz 
./solr-8.11.2/bin/solr start
 
# 3 create 10G test file
head -c 10G /dev/urandom > /mnt/solrramdisk/solr-8.11.2/server/solr-webapp/webapp/sample.bin
 
# on other host with same mtu and /etc/sysctl.conf settings 
# 4 download test file from Solr Jetty 
wget --report-speed=bits -O /dev/null http://192.168.220.135:8983/solr/sample.bin
# Result - 2023-02-24 21:42:29 (4.83 Gb/s) - ‘/dev/null’ saved [10737418240]
 
# 5 download test file from nginx (in nginx set root /mnt/solrramdisk;)
wget --report-speed=bits -O /dev/null http://192.168.220.135/solr-8.11.2/server/solr-webapp/webapp/sample.bin
# Result - 2023-02-24 21:34:40 (7.28 Gb/s) - ‘/dev/null’ saved [10737418240/10737418240]
In a virtual network with two virtual machines on the same host, Jetty is faster than the 1.2 Gbit/s I measured in my tests on real hardware, but it is still slower than nginx, and this does not explain why scrolling is significantly slower than what Jetty is capable of.
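To compare the two servers with the same tool and units, the download can also be timed with curl. This is only a sketch: the helper names to_gbit and measure are mine, and it assumes curl and awk are installed on the client. It relies on curl's %{speed_download} write-out variable, which reports the average speed in bytes per second, and converts that to decimal Gbit/s (10^9 bits).

```shell
# Convert bytes per second to Gbit/s (10^9 bits), two decimal places.
# LC_ALL=C keeps the decimal point independent of the locale.
to_gbit() {
  LC_ALL=C awk -v bps="$1" 'BEGIN { printf "%.2f\n", bps * 8 / 1e9 }'
}

# Download URL over a single connection, discard the body,
# and print the average speed in Gbit/s.
measure() {
  bps=$(curl -s -o /dev/null -w '%{speed_download}' "$1") || return 1
  to_gbit "$bps"
}

# Usage (same URLs as in steps 4 and 5):
#   measure http://192.168.220.135:8983/solr/sample.bin
#   measure http://192.168.220.135/solr-8.11.2/server/solr-webapp/webapp/sample.bin
```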
 
   Unfortunately, I am not a Java developer and cannot offer a patch or rebuild the jar with a bigger buffer myself (not without instructions on how to do it from scratch), but I can test and benchmark a new binary or any other Solr settings to help find ways to speed up Solr scroll.
   I hope that a bigger buffer in FastWriter.java will speed up receiving data in a single thread, but this is only a hypothesis, and it is better evaluated by specialists in Solr's internal and network architecture. In my tests: 1) Solr's Jetty can serve data much faster than it is served from collections. 2) Data from a collection is served at the same speed on servers that differ considerably in performance (CPU cores 30% slower, SAS RAID disks instead of a RAM drive) with a 10 Gbit network. 3) Repeating the scroll does not increase the speed (disk caches and other Solr and OS caches do not help).
   Is there a report anywhere of Solr, under any conditions, sending data from a collection faster than 400 Mbit/s over a single (single-threaded) connection? Apart from the FastWriter.java buffers, where else should I look for what limits the data transfer rate on modern fast hardware during a scroll of a big collection?
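For anyone who wants to compare against the static-file numbers above, the speed of one scroll page over a single connection can be measured roughly as follows. This is a sketch under assumptions: it assumes "scroll" means cursorMark paging through /select, that the uniqueKey field is named id, and that port 8983 is used; the collection name and page size are placeholders, and scroll_url and page_speed are hypothetical helper names. It fetches only the first page, so it is not a full scroll.

```shell
# Build the URL of the first cursor page.
# Arguments: host, collection, rows per page.
# Assumption: the uniqueKey field is "id" (cursorMark needs it in the sort).
scroll_url() {
  printf 'http://%s:8983/solr/%s/select?q=*:*&rows=%s&sort=id%%20asc&cursorMark=*\n' "$1" "$2" "$3"
}

# Fetch one page, discard the body, and print
# "<bytes received> <average bytes per second>".
page_speed() {
  curl -s -o /dev/null -w '%{size_download} %{speed_download}\n' "$(scroll_url "$@")"
}

# Usage (collection name and page size are placeholders):
#   page_speed 192.168.220.135 mycollection 100000
```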
