On Sun, Mar 19, 2023 at 8:39 AM Fikavec F <fika...@yandex.ru> wrote:

>    I was able to create a collection with "solr.SimpleTextCodecFactory"
> as the codecFactory, and Solr can process (return) only about 2x more
> documents per second from it (214 410 documents per second vs 115 000 for
> "solr.SchemaCodecFactory" with compression).
>
> I expected much, much more, because this is a simple iteration and sending
> small fields to the output. Is this enough to make sure that the Solr limit
> of processing 115 000 documents per second is not due only to compression,
> but something else? Or is the speed of SimpleTextCodecFactory in this case
> not an indicator for correct testing and yet it is necessary to create my
> own codecFactory class without compression?
>

SimpleTextCodecFactory is meant for demonstration and clear-text readability
of the data on disk.  It's for educational purposes; a "toy" of sorts.  So
it's impressive that you nearly doubled the throughput of your use-case with
it.


> I also tried to create an 8-shard collection with the standard codec for
> the test; the document iteration rate is the same, about 115 000 small
> documents per second.
>

Interesting.  This suggests to me the aggregation node is limiting things.
This is implemented by SearchHandler, and I don't imagine any simple
changes to make it operate differently.

I think Solr's "streaming expressions" capability (really a set of
capabilities) is much closer to this use-case but I looked around and I
think you'll probably have the same limitation with the "select"
expression.  I was hoping it would send a "/select" to each shard with
distrib=false to bypass SearchHandler's distributed search but no.  I could
imagine improvements there.
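
To make the idea concrete, here is a rough sketch of what a client could do
by hand today: send one "/select" per shard core with distrib=false and
merge the results itself.  Only the "/select" handler and the distrib=false
parameter come from Solr; the ShardUrls class, the perShardSelectUrls
helper, and the host and core names are hypothetical and just for
illustration.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class ShardUrls {

    /**
     * Builds one non-distributed /select URL per shard core.
     * coreBaseUrls are assumed to look like
     * "http://host:8983/solr/<core-name>" (hypothetical names).
     */
    static List<String> perShardSelectUrls(
            List<String> coreBaseUrls, String query, String fieldList, int rows) {
        List<String> urls = new ArrayList<>();
        for (String base : coreBaseUrls) {
            urls.add(base + "/select"
                    + "?q=" + URLEncoder.encode(query, StandardCharsets.UTF_8)
                    + "&fl=" + URLEncoder.encode(fieldList, StandardCharsets.UTF_8)
                    + "&rows=" + rows
                    // distrib=false keeps the request on this core only, so
                    // no aggregation step runs on the Solr side.
                    + "&distrib=false");
        }
        return urls;
    }

    public static void main(String[] args) {
        // Hypothetical core URLs; real ones depend on the cluster layout.
        List<String> cores = List.of(
                "http://host1:8983/solr/coll_shard1_replica_n1",
                "http://host2:8983/solr/coll_shard2_replica_n2");
        for (String url : perShardSelectUrls(cores, "*:*", "id", 1000)) {
            System.out.println(url);
        }
    }
}
```

Each URL can then be fetched in parallel by the client.  Whether that
actually beats 115 000 documents per second is something you'd have to
measure.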


> P.S. I currently can't plug in even a simple codec layer as
> <codecFactory class="my.Lucene87CodecWithNoFieldCompression"/> in
> solrconfig.xml:
>
> package my;
> import org.apache.lucene.codecs.FilterCodec;
> import org.apache.lucene.codecs.StoredFieldsFormat;
> import org.apache.lucene.codecs.lucene87.Lucene87Codec;
> import org.apache.lucene.codecs.lucene87.Lucene87StoredFieldsFormat;
>
> public final class Lucene87CodecWithNoFieldCompression extends FilterCodec
> {
>     private final StoredFieldsFormat storedFieldsFormat;
>
>     public Lucene87CodecWithNoFieldCompression() {
>         super("Lucene87CodecWithNoFieldCompression", new Lucene87Codec());
>         storedFieldsFormat = new Lucene87StoredFieldsFormat();
>     }
>     @Override
>     public StoredFieldsFormat storedFieldsFormat() {
>         return storedFieldsFormat;
>     }
>     @Override
>     public String toString() {
>         return getClass().getSimpleName();
>     }
> }
>

At a glance it looks good.  What error do you get?

~ David
