On Sun, Mar 19, 2023 at 8:39 AM Fikavec F <fika...@yandex.ru> wrote:
> I was able to create a collection with "solr.SimpleTextCodecFactory"
> codecFactory and solr can process (return) only 2x more documents per second
> from it (214 410 documents per second vs 115 000 "solr.SchemaCodecFactory"
> with compression).
> I expected much much more, because this is a simple iteration and sending
> small fields to the output. Is this enough to make sure that the Solr limit
> of processing 115 000 documents per second is not due only to compression,
> but something else? Or is the speed of SimpleTextCodecFactory in this case
> not an indicator for correct testing and yet it is necessary to create my
> own codecFactory class without compression?
>

SimpleTextCodecFactory is more for demonstration and clear-text readability
of the data on disk. It's for educational purposes; a "toy" of sorts. So
it's impressive you doubled the performance of your use-case with it.

> I also tried to create a collection with a standard codec of 8 shards for
> the test, the documents iteration rate is the same about 115 000 small
> documents per second.
>

Interesting. This suggests to me the aggregation node is limiting things.
This is implemented by SearchHandler, and I don't imagine any simple
changes to make it operate differently. I think Solr's "streaming
expressions" capability (really a set of capabilities) is much closer to
this use-case, but I looked around and I think you'll probably have the same
limitation with the "select" expression. I was hoping it would send a
"/select" to each shard with distrib=false to bypass SearchHandler's
distributed search, but no. I could imagine improvements there.

> P.S.
> As a <codecFactory class="my.Lucene87CodecWithNoFieldCompression"/>
> in solrconfig.xml, I currently can't connect even a simple codec layer:
>
> package my;
> import org.apache.lucene.codecs.FilterCodec;
> import org.apache.lucene.codecs.StoredFieldsFormat;
> import org.apache.lucene.codecs.lucene87.Lucene87Codec;
> import org.apache.lucene.codecs.lucene87.Lucene87StoredFieldsFormat;
>
> public final class Lucene87CodecWithNoFieldCompression extends FilterCodec {
>     private final StoredFieldsFormat storedFieldsFormat;
>
>     public Lucene87CodecWithNoFieldCompression() {
>         super("Lucene87CodecWithNoFieldCompression", new Lucene87Codec());
>         storedFieldsFormat = new Lucene87StoredFieldsFormat();
>     }
>     @Override
>     public StoredFieldsFormat storedFieldsFormat() {
>         return storedFieldsFormat;
>     }
>     @Override
>     public String toString() {
>         return getClass().getSimpleName();
>     }
> }
>

At a glance it looks good. What error do you get?

~ David
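
A rough illustration of the per-shard idea mentioned above (a "/select" sent
to each shard with distrib=false): the sketch below only builds the request
URLs, one per shard core. The core URLs, the class name PerShardSelect and
its method names are hypothetical and made up for illustration; only the
"/select" handler and the q, fl, rows and distrib parameters are real Solr
names. Fetching, paging within each shard, and merging the results would be
up to the client, and whether this actually beats 115 000 documents per
second is untested.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Builds one non-distributed "/select" URL per shard core (hypothetical helper). */
final class PerShardSelect {

    /** Returns a "/select" URL asking a single core to answer from its local index only. */
    static String shardSelectUrl(String coreBaseUrl, String query, String fieldList, int rows) {
        // Tolerate a trailing slash on the core URL.
        String base = coreBaseUrl.endsWith("/")
                ? coreBaseUrl.substring(0, coreBaseUrl.length() - 1)
                : coreBaseUrl;
        return base + "/select"
                + "?q=" + encode(query)
                + "&fl=" + encode(fieldList)
                + "&rows=" + rows
                + "&distrib=false"; // do not fan out to other shards
    }

    /** Builds the URL list for every shard core; the caller queries them in parallel. */
    static List<String> shardSelectUrls(List<String> coreBaseUrls, String query,
                                        String fieldList, int rows) {
        List<String> urls = new ArrayList<>();
        for (String core : coreBaseUrls) {
            urls.add(shardSelectUrl(core, query, fieldList, rows));
        }
        return urls;
    }

    private static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Hypothetical core URLs; real names depend on the collection layout.
        List<String> cores = List.of(
                "http://localhost:8983/solr/test_shard1_replica_n1",
                "http://localhost:8983/solr/test_shard2_replica_n2");
        for (String url : shardSelectUrls(cores, "*:*", "id", 1000)) {
            System.out.println(url);
        }
    }
}
```

Because each core answers on its own here, no single aggregation node sits
in the path; the trade-off is that ordering and merging across shards
become the client's problem.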