Re: Solr 8.5.1 startup error - lengthTag=109, too big.

2020-05-27 Thread Zheng Lin Edwin Yeo
Hi Mike,

Thanks for your reply.

Yes, I have SSL enabled in 8.2.1 as well. The error is there even if I use
the same certificate from 8.2.1, which was working fine there.
I have also generated the certificate for both 8.2.1 and 8.5.1 by the same
method.

Are there any changes between these 2 versions that would have affected
this? (E.g. changes in the way we generate the certificate?)

Regards,
Edwin

On Wed, 27 May 2020 at 04:23, Mike Drob  wrote:

> Did you have SSL enabled with 8.2.1?
>
> The error looks common to certificate handling and not specific to Solr.
>
> I would verify that you have no extra characters in your certificate file
> (including line endings) and that the keystore type that you specified
> matches the file you are presenting (JKS or PKCS12)
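>
> For example, a quick sanity check (a sketch; the keystore paths are
> placeholders):
>
>     keytool -list -keystore /path/to/solr-ssl.keystore.p12 -storetype PKCS12
>     keytool -list -keystore /path/to/solr-ssl.keystore.jks -storetype JKS
>
> If keytool reports the same "DerInputStream.getLength()" error for the type
> you configured, the file and the declared store type don't match.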
>
> Mike
>
> On Sat, May 23, 2020 at 10:11 PM Zheng Lin Edwin Yeo  >
> wrote:
>
> > Hi,
> >
> > I'm trying to upgrade from Solr 8.2.1 to Solr 8.5.1, with Solr SSL
> > Authentication and Authorization.
> >
> > However, I get the following error when I enable SSL. Solr itself can
> > start up if there is no SSL.  The main error that I see is this
> >
> >   java.io.IOException: DerInputStream.getLength(): lengthTag=109, too
> big.
> >
> > What could be the reason that causes this?
> >
> >
> > INFO  - 2020-05-24 10:38:20.080;
> > org.apache.solr.util.configuration.SSLConfigurations; Setting
> > javax.net.ssl.keyStorePassword
> > INFO  - 2020-05-24 10:38:20.081;
> > org.apache.solr.util.configuration.SSLConfigurations; Setting
> > javax.net.ssl.trustStorePassword
> > Waiting up to 120 to see Solr running on port 8983
> > java.lang.reflect.InvocationTargetException
> > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
> > at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown
> Source)
> > at java.lang.reflect.Method.invoke(Unknown Source)
> > at org.eclipse.jetty.start.Main.invokeMain(Main.java:218)
> > at org.eclipse.jetty.start.Main.start(Main.java:491)
> > at org.eclipse.jetty.start.Main.main(Main.java:77)
> > Caused by: java.security.PrivilegedActionException: java.io.IOException:
> > DerInputStream.getLength(): lengthTag=109, too big.
> > at java.security.AccessController.doPrivileged(Native Method)
> > at
> > org.eclipse.jetty.xml.XmlConfiguration.main(XmlConfiguration.java:1837)
> > ... 7 more
> > Caused by: java.io.IOException: DerInputStream.getLength():
> lengthTag=109,
> > too big.
> > at sun.security.util.DerInputStream.getLength(Unknown Source)
> > at sun.security.util.DerValue.init(Unknown Source)
> > at sun.security.util.DerValue.<init>(Unknown Source)
> > at sun.security.util.DerValue.<init>(Unknown Source)
> > at sun.security.pkcs12.PKCS12KeyStore.engineLoad(Unknown Source)
> > at java.security.KeyStore.load(Unknown Source)
> > at
> >
> >
> org.eclipse.jetty.util.security.CertificateUtils.getKeyStore(CertificateUtils.java:54)
> > at
> >
> >
> org.eclipse.jetty.util.ssl.SslContextFactory.loadKeyStore(SslContextFactory.java:1188)
> > at
> >
> >
> org.eclipse.jetty.util.ssl.SslContextFactory.load(SslContextFactory.java:323)
> > at
> >
> >
> org.eclipse.jetty.util.ssl.SslContextFactory.doStart(SslContextFactory.java:245)
> > at
> >
> >
> org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:72)
> > at
> >
> >
> org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:169)
> > at
> >
> >
> org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:117)
> > at
> >
> >
> org.eclipse.jetty.server.SslConnectionFactory.doStart(SslConnectionFactory.java:92)
> > at
> >
> >
> org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:72)
> > at
> >
> >
> org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:169)
> > at
> >
> >
> org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:117)
> > at
> >
> >
> org.eclipse.jetty.server.AbstractConnector.doStart(AbstractConnector.java:320)
> > at
> >
> >
> org.eclipse.jetty.server.AbstractNetworkConnector.doStart(AbstractNetworkConnector.java:81)
> > at
> >
> org.eclipse.jetty.server.ServerConnector.doStart(ServerConnector.java:231)
> > at
> >
> >
> org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:72)
> > at org.eclipse.jetty.server.Server.doStart(Server.java:385)
> > at
> >
> >
> org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:72)
> > at
> >
> >
> org.eclipse.jetty.xml.XmlConfiguration.lambda$main$0(XmlConfiguration.java:1888)
> > ... 9 more
> > java.lang.reflect.InvocationTargetException
> > at 

Re: TimestampUpdateProcessorFactory updates the field even if the value is present

2020-05-27 Thread Erick Erickson
When is “NOW” ;)? The process for updating a doc in SolrCloud is:

1> the doc is received by some solr node.

2> the doc is forwarded to the shard leader if necessary.

3> the doc is distributed from the shard leader to all replicas of that shard.

4> the doc is indexed on each replica.

So just using NOW as the default value, the timestamp would be assigned in
step <4> and would almost certainly be different on the different replicas of
the single shard for any number of reasons from the servers not being
exactly in sync to propagation delays to replica N happening to hit a GC pause
to….

The update processor factory assigns the timestamp once on the leader so
it’s the same on all copies of the doc, assuming it is in the chain before
DistributedUpdateProcessorFactory.

So with a single-replica (leader only) setup, or non-cloud setups, the two
would produce near enough to identical results. But if there are multiple
replicas you have to use the factory.

Hmm, I suppose if you are using TLOG/PULL replicas it wouldn’t matter which
approach you used insofar as the doc on each replica would have the
same timestamp.

Best,
Erick

> On May 27, 2020, at 3:49 PM, gnandre  wrote:
> 
> Thanks for the detailed response, Chris. I am aware of the partial (atomic)
> updates. Thanks for clarifying the confusion about input document vs
> indexed document. I was thinking that TimestampUpdateProcessorFactory
> checks if the value exists in the field inside the indexed document before
> updating it, but actually it checks whether it is present inside the input
> request. But then why do we require an explicit processor for that? This can be
> done with a simple field in the schema that has a default value of NOW.
> 
> I tried your idea about MinFieldValueUpdateProcessorFactory but it does not
> work. Here is the configuration:
> 
> <updateRequestProcessorChain>
>   <processor class="solr.TimestampUpdateProcessorFactory">
>     <str name="fieldName">index_time_stamp_create</str>
>   </processor>
>   <processor class="solr.LogUpdateProcessorFactory" />
>   <processor class="solr.DistributedUpdateProcessorFactory" />
>   <processor class="solr.MinFieldValueUpdateProcessorFactory">
>     <str name="fieldName">index_time_stamp_create</str>
>   </processor>
>   <processor class="solr.RunUpdateProcessorFactory" />
> </updateRequestProcessorChain>
> 
> I think MinFieldValueUpdateProcessorFactory keeps the min value in a
> multivalued field, which index_time_stamp_create is not.
> 
> On Tue, May 26, 2020 at 2:31 PM Chris Hostetter 
> wrote:
> 
>> : Subject: TimestampUpdateProcessorFactory updates the field even if the
>> : value is present
>> :
>> : Hi,
>> :
>> : Following is the update request processor chain.
>> :
>> : <updateRequestProcessorChain>
>> :   <processor class="solr.TimestampUpdateProcessorFactory">
>> :     <str name="fieldName">index_time_stamp_create</str>
>> :   </processor>
>> :   <processor class="solr.LogUpdateProcessorFactory" />
>> :   <processor class="solr.RunUpdateProcessorFactory" />
>> : </updateRequestProcessorChain>
>> :
>> : And, here is how the field is defined in schema.xml
>> :
>> : <field name="index_time_stamp_create" ... stored="true" />
>> :
>> : Every time I index the same document, above field changes its value with
>> : latest timestamp. According to TimestampUpdateProcessorFactory  javadoc
>> : page, if a document does not contain a value in the timestamp field, a
>> new
>> 
>> based on the wording of your question, i suspect you are confused about
>> the overall behavior of how "updating" an existing document works in solr,
>> and how update processors "see" an *input document* when processing an
>> add/update command.
>> 
>> 
>> First off, completely ignoring TimestampUpdateProcessorFactory and
>> assuming just the simplest possible update change, let's clarify how
>> "updates" work. Let's assume that when you say you "index the same
>> document" twice you do so with a few diff field values ...
>> 
>> First Time...
>> 
>> {  id:"x",  title:"" }
>> 
>> Second time...
>> 
>> {  id:"x",  body:"      xxx" }
>> 
>> Solr does not implicitly know that you are trying to *update* that
>> document, the final result will not be a document containing both a
>> "title" field and "body" field in addition to the "id", it will *only*
>> have the "id" and "body" fields and the title field will be lost.
>> 
>> The way to "update" a document *and keep existing field values* is with
>> one of the "Atomic Update" command options...
>> 
>> 
>> https://lucene.apache.org/solr/guide/8_4/updating-parts-of-documents.html#UpdatingPartsofDocuments-AtomicUpdates
>> 
>> {  id:"x",  title:"" }
>> 
>> Second time...
>> 
>> {  id:"x",  body: { set: "      xxx" } }
>> 
>> 
>> Now, with that background info clarified: let's talk about update
>> processors
>> 
>> 
>> The docs for TimestampUpdateProcessorFactory are referring to how it
>> modifies an *input* document that it receives (as part of the processor
>> chain). It adds the timestamp field if it's not already in the *input*
>> document; it doesn't know anything about whether that document is already
>> in the index, or if it has a value for that field in the index.
>> 
>> 
>> When processors like TimestampUpdateProcessorFactory (or any other
>> processor that modifies an *input* document) are run they don't know if the
>> document you are "indexing" already exists in the index or 

Re: SolrCloud upgrade concern

2020-05-27 Thread Erick Erickson
The biggest issue with CDCR is it’s rather fragile and requires monitoring, 
it’s not a “fire and forget” type of functionality. For instance, the use of the
tlogs as a queueing mechanism means that if, for any reason, the communications
between DCs is broken, the tlogs will grow forever until the connection is
re-established. Plus the other issues Jason pointed out.

So yes, some companies do use CDCR to communicate between separate
DCs. But they also put in some “roll your own” type of monitoring to ensure
things don’t go haywire.

Alternatives:
1> use something that’s built from the ground up to provide reliable 
 messaging between DCs. Kafka or similar has been mentioned. Write
 your updates to the Kafka queue and consume them in both DCs.
 These kinds of solutions have a lot more robustness.

2> reproduce your system-of-record (rather than Solr) in the DCs and
   treat the DCs as separate installations. If you adopt this approach,
   some of the streaming capabilities can be used to monitor that they stay
   in sync. For instance, have a background or periodic task (a complete run
   will take a while) that wraps two “search” streams in a “unique” decorator;
   anything except an empty result identifies docs not on both DCs (see the
   sketch after this list).

3> Oh Dear. This one is “interesting”. Wrap a “topic” stream on DC1 in
   an update decorator for DC2 and wrap both of those in a daemon decorator.
   That’s gobbledygook, and you’ll have to dig through the docs a bit for
   that to make sense. Essentially the topic stream is one of the very few
   streams that does not (IIRC) require all values in the fl list to be
   docValues. It fires the first time and establishes a checkpoint, finding
   all docs up to that point. Thereafter, it’ll get docs that have changed
   since the last time it ran. It uses a tiny collection for record keeping.
   Each time the topic stream finds new docs, it passes them to the update
   stream, which sends them to another DC. Wrapping the whole thing in a
   daemon decorator means it periodically runs in the background. The one
   shortcoming is that this approach doesn’t propagate deletes. That’s enough
   of that until you tell us whether it sounds worth pursuing ;)
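
For <2> and <3>, rough streaming-expression sketches (untested; the collection
names, field lists, checkpoint collection, and run interval are placeholders,
and the update stream assumes it runs somewhere that can address the DC2
collection). For <2>, the “complement” decorator is actually a more direct fit
than “unique”: it emits tuples present in the first stream but missing from
the second:

  complement(
    search(collection_dc1, q="*:*", fl="id", sort="id asc", qt="/export"),
    search(collection_dc2, q="*:*", fl="id", sort="id asc", qt="/export"),
    on="id"
  )

And for <3>, a daemon that periodically pushes changed docs from DC1 to DC2:

  daemon(id="dc1_to_dc2", runInterval="60000",
    update(collection_dc2, batchSize=100,
      topic(checkpoints, collection_dc1, q="*:*",
            fl="id,field1,field2", id="dc1_topic")))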

So overall, you _can_ use CDCR to connect remote DCs, but it takes time and
energy to make it robust. Its advantage is that it’s entirely contained within
Solr. But it’s not getting much attention lately, meaning nobody has decided
the functionality is important enough to them to donate the time/resources to
make it more robust. Were someone to take an active interest in it, likely it
could be kept around as a plugin that core Solr is not responsible for.

Best,
Erick

> On May 27, 2020, at 4:43 PM, gnandre  wrote:
> 
> Thanks, Jason. This is very helpful.
> 
> I should clarify though that I am not using CDCR currently with my
> existing master-slave architecture. What I meant to say earlier was that we
> will be relying heavily on the CDCR feature if we migrate from solr
> master-slave architecture to solrcloud architecture. Are there any
> alternatives to CDCR? AFAIK, if you want to replicate between different
> data centers then CDCR is the only option. Also, when you say a lot of
> customers are using SolrCloud successfully, how are they working around the
> CDCR situation? Do they not have any data center use cases? Is there some
> list maintained somewhere where one can find which companies are using
> SolrCloud successfully?
> 
> 
> 
> On Wed, May 27, 2020 at 9:27 AM Jason Gerlowski 
> wrote:
> 
>> Hi Arnold,
>> 
>> From what I saw in the community, CDCR saw an initial burst of
>> development around when it was contributed, but hasn't seen much
>> attention or improvement since.  So while it's been around for a few
>> years, I'm not sure it's improved much in terms of stability or
>> compatibility with other Solr features.
>> 
>> Some of the bigger ticket issues still open around CDCR:
>> - SOLR-11959 no support for basic-auth
>> - SOLR-12842 infinite retry of failed update-requests (leads to
>> sync/recovery problems)
>> - SOLR-12057 no real support for NRT/TLOG/PULL replicas
>> - SOLR-10679 no support for collection aliases
>> 
>> These are in addition to other more architectural issues: CDCR can be
>> a bottleneck on clusters with high ingestion rates, CDCR uses
>> full-index-replication more than traditional indexing setups, which
>> can cause issues with modern index sizes, etc.
>> 
>> So, unfortunately, no real good news in terms of CDCR maturing much in
>> recent releases.  Joel Bernstein filed a JIRA recently suggesting its
>> removal entirely actually.  Though I don't think it's gone anywhere.
>> 
>> That said, I gather from what you said that you're already using CDCR
>> successfully with Master-Slave.  If none of these pitfalls are biting
>> you in your current Master-Slave setup, you might not be bothered by
>> them any more in SolrCloud.  Most of the problems with CDCR are
>> applicable in master-slave as 

Re: SolrCloud upgrade concern

2020-05-27 Thread gnandre
Thanks, Jason. This is very helpful.

I should clarify though that I am not using CDCR currently with my
existing master-slave architecture. What I meant to say earlier was that we
will be relying heavily on the CDCR feature if we migrate from solr
master-slave architecture to solrcloud architecture. Are there any
alternatives to CDCR? AFAIK, if you want to replicate between different
data centers then CDCR is the only option. Also, when you say a lot of
customers are using SolrCloud successfully, how are they working around the
CDCR situation? Do they not have any data center use cases? Is there some
list maintained somewhere where one can find which companies are using
SolrCloud successfully?



On Wed, May 27, 2020 at 9:27 AM Jason Gerlowski 
wrote:

> Hi Arnold,
>
> From what I saw in the community, CDCR saw an initial burst of
> development around when it was contributed, but hasn't seen much
> attention or improvement since.  So while it's been around for a few
> years, I'm not sure it's improved much in terms of stability or
> compatibility with other Solr features.
>
> Some of the bigger ticket issues still open around CDCR:
> - SOLR-11959 no support for basic-auth
> - SOLR-12842 infinite retry of failed update-requests (leads to
> sync/recovery problems)
> - SOLR-12057 no real support for NRT/TLOG/PULL replicas
> - SOLR-10679 no support for collection aliases
>
> These are in addition to other more architectural issues: CDCR can be
> a bottleneck on clusters with high ingestion rates, CDCR uses
> full-index-replication more than traditional indexing setups, which
> can cause issues with modern index sizes, etc.
>
> So, unfortunately, no real good news in terms of CDCR maturing much in
> recent releases.  Joel Bernstein filed a JIRA recently suggesting its
> removal entirely actually.  Though I don't think it's gone anywhere.
>
> That said, I gather from what you said that you're already using CDCR
> successfully with Master-Slave.  If none of these pitfalls are biting
> you in your current Master-Slave setup, you might not be bothered by
> them any more in SolrCloud.  Most of the problems with CDCR are
> applicable in master-slave as well as SolrCloud.  I wouldn't recommend
> CDCR if you were starting from scratch, and I still recommend you
> consider other options.  But since you're already using it with some
> success, it might be an orthogonal concern to your potential migration
> to SolrCloud.
>
> Best of luck deciding!
>
> Jason
>
> On Fri, May 22, 2020 at 7:06 PM gnandre  wrote:
> >
> > Thanks for this reply, Jason.
> >
> > I am mostly worried about CDCR feature. I am relying heavily on it.
> > Although, I am planning to use Solr 8.3. It has been a long time since CDCR
> > was first introduced. I wonder what the state of CDCR is in 8.3. Is it
> > stable now?
> >
> > On Wed, Jan 22, 2020, 8:01 AM Jason Gerlowski 
> wrote:
> >
> > > Hi Arnold,
> > >
> > > The stability and complexity issues Mark highlighted in his post
> > > aren't just imagined - there are real, sometimes serious, bugs in
> > > SolrCloud features.  But at the same time there are many many stable
> > > deployments out there where SolrCloud is a real success story for
> > > users.  Small example, I work at a company (Lucidworks) where our main
> > > product (Fusion) is built heavily on top of SolrCloud and we see it
> > > deployed successfully every day.
> > >
> > > In no way am I trying to minimize Mark's concerns (or David's).  There
> > > are stability bugs.  But the extent to which those need to affect you
> > > depends a lot on what your deployment looks like.  How many nodes?
> > > How many collections?  How tightly are you trying to squeeze your
> > > hardware?  Is your network flaky?  Are you looking to use any of
> > > SolrCloud's newer, less stable features like CDCR, etc.?
> > >
> > > Is SolrCloud better for you than Master/Slave?  It depends on what
> > > you're hoping to gain by a move to SolrCloud, and on your answers to
> > > some of the questions above.  I would be leery of following any
> > > recommendations that are made without regard for your reason for
> > > switching or your deployment details.  Those things are always the
> > > biggest driver in terms of success.
> > >
> > > Good luck making your decision!
> > >
> > > Best,
> > >
> > > Jason
> > >
>


Re: TimestampUpdateProcessorFactory updates the field even if the value is present

2020-05-27 Thread gnandre
Thanks for the detailed response, Chris. I am aware of the partial (atomic)
updates. Thanks for clarifying the confusion about input document vs
indexed document. I was thinking that TimestampUpdateProcessorFactory
checks if the value exists in the field inside the indexed document before
updating it, but actually it checks whether it is present inside the input
request. But then why do we require an explicit processor for that? This can be
done with a simple field in the schema that has a default value of NOW.
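
For reference, the schema-default approach would look something like this (a
sketch; the field type name is an assumption):

  <field name="index_time_stamp_create" type="pdate" default="NOW"
         stored="true" indexed="true" />

The catch, as discussed elsewhere in the thread, is that a schema default is
applied when each replica indexes the doc, while the processor can assign the
value once before distribution.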

I tried your idea about MinFieldValueUpdateProcessorFactory but it does not
work. Here is the configuration:


<updateRequestProcessorChain>
  <processor class="solr.TimestampUpdateProcessorFactory">
    <str name="fieldName">index_time_stamp_create</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.DistributedUpdateProcessorFactory" />
  <processor class="solr.MinFieldValueUpdateProcessorFactory">
    <str name="fieldName">index_time_stamp_create</str>
  </processor>
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>

I think MinFieldValueUpdateProcessorFactory keeps the min value in a
multivalued field, which index_time_stamp_create is not.

On Tue, May 26, 2020 at 2:31 PM Chris Hostetter 
wrote:

> : Subject: TimestampUpdateProcessorFactory updates the field even if the
> : value is present
> :
> : Hi,
> :
> : Following is the update request processor chain.
> :
> : <updateRequestProcessorChain>
> :   <processor class="solr.TimestampUpdateProcessorFactory">
> :     <str name="fieldName">index_time_stamp_create</str>
> :   </processor>
> :   <processor class="solr.LogUpdateProcessorFactory" />
> :   <processor class="solr.RunUpdateProcessorFactory" />
> : </updateRequestProcessorChain>
> :
> : And, here is how the field is defined in schema.xml
> :
> : <field name="index_time_stamp_create" ... stored="true" />
> :
> : Every time I index the same document, above field changes its value with
> : latest timestamp. According to TimestampUpdateProcessorFactory  javadoc
> : page, if a document does not contain a value in the timestamp field, a
> new
>
> based on the wording of your question, i suspect you are confused about
> the overall behavior of how "updating" an existing document works in solr,
> and how update processors "see" an *input document* when processing an
> add/update command.
>
>
> First off, completely ignoring TimestampUpdateProcessorFactory and
> assuming just the simplest possible update change, let's clarify how
> "updates" work. Let's assume that when you say you "index the same
> document" twice you do so with a few diff field values ...
>
> First Time...
>
> {  id:"x",  title:"" }
>
> Second time...
>
> {  id:"x",  body:"      xxx" }
>
> Solr does not implicitly know that you are trying to *update* that
> document, the final result will not be a document containing both a
> "title" field and "body" field in addition to the "id", it will *only*
> have the "id" and "body" fields and the title field will be lost.
>
> The way to "update" a document *and keep existing field values* is with
> one of the "Atomic Update" command options...
>
>
> https://lucene.apache.org/solr/guide/8_4/updating-parts-of-documents.html#UpdatingPartsofDocuments-AtomicUpdates
>
> {  id:"x",  title:"" }
>
> Second time...
>
> {  id:"x",  body: { set: "      xxx" } }
>
>
> Now, with that background info clarified: let's talk about update
> processors
>
>
> The docs for TimestampUpdateProcessorFactory are referring to how it
> modifies an *input* document that it receives (as part of the processor
> chain). It adds the timestamp field if it's not already in the *input*
> document; it doesn't know anything about whether that document is already
> in the index, or if it has a value for that field in the index.
>
>
> When processors like TimestampUpdateProcessorFactory (or any other
> processor that modifies an *input* document) are run they don't know if the
> document you are "indexing" already exists in the index or not.  Even if
> you are using the "atomic update" options to set/remove/add a field value,
> with the intent of preserving all other field values, the documents passed
> down the processor chain don't include those values until the "document
> merger" logic is run -- as part of the DistributedUpdateProcessor (which,
> if not explicit in your chain, happens immediately before the
> RunUpdateProcessorFactory)
>
> Off the top of my head i don't know if there is an "easy" way to have a
> Timestamp added to "new" documents, but left "as is" for existing
> documents.
>
> Untested idea
>
> Use an explicitly configured
> DistributedUpdateProcessorFactory, so that (in addition to putting
> TimestampUpdateProcessorFactory before it) you can
> also put MinFieldValueUpdateProcessorFactory on the timestamp field
> *after* DistributedUpdateProcessorFactory (but before
> RunUpdateProcessorFactory).
>
> I think that would work?
>
> Just putting TimestampUpdateProcessorFactory after the
> DistributedUpdateProcessorFactory would be dangerous, because it would
> introduce discrepancies -- each replica would wind up with its own
> locally computed timestamp.  Having the timestamp generated before the
> distributed update processor ensures the value is computed only once.
>
> -Hoss
> http://www.lucidworks.com/
>


Re: Solr multi core query too slow

2020-05-27 Thread Erick Erickson
First of all, asking for that many rows will spend a lot of time
gathering the document fields. Assuming you have stored fields,
each request requires:

1> the aggregator node getting the candidate 100K docs from each shard

2> the aggregator node sorting those 100K docs from each shard into the true
top 100K based on the sort criteria (score by default)

3> the aggregator node going back to the shards and asking them for those docs
of that 100K that are resident on that shard

4> the aggregator node assembling the final docs to be sent to the client and
sending them.

So my guess is that when you fire requests at a particular replica that has to 
get them from the other shard’s replica on another host, the network 
back-and-forth is killing your perf. It’s not that your network is having 
problems, just that you’re pushing a lot of data back and forth in your 
poorly-performing cases.

So first of all, specifying 100K rows is an anti-pattern. Outside of streaming, 
Solr is built on the presumption that you’re after the top few rows (< 100, 
say). The times vary a lot depending on whether you need to read stored fields 
BTW.

Second, I suspect your test is bogus. If you run the tests in the order you 
gave, the first one will read the necessary data from disk and probably have it 
in the OS disk cache for the second and subsequent. And/or you’re getting 
results from your queryResultCache (although you’d have to have a big one). 
Specifying the exact same query when trying to time things is usually a mistake.

If your use-case requires 100K rows, you should be using streaming or
cursorMark. While that won’t make the end-to-end time shorter, it won’t put
such a strain on the system.
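
A minimal cursorMark loop in SolrJ might look like this (a sketch; "client"
is an already-built SolrClient, and the collection name, page size, and sort
field are placeholders -- the sort must include the uniqueKey field):

  import org.apache.solr.client.solrj.SolrQuery;
  import org.apache.solr.client.solrj.response.QueryResponse;
  import org.apache.solr.common.params.CursorMarkParams;

  SolrQuery q = new SolrQuery("*:*");
  q.setRows(1000);                            // page size, not the total
  q.setSort(SolrQuery.SortClause.asc("id"));  // uniqueKey in the sort
  String cursorMark = CursorMarkParams.CURSOR_MARK_START;
  boolean done = false;
  while (!done) {
    q.set(CursorMarkParams.CURSOR_MARK_PARAM, cursorMark);
    QueryResponse rsp = client.query("test", q);
    // process rsp.getResults() here ...
    String next = rsp.getNextCursorMark();
    done = cursorMark.equals(next);  // unchanged mark means we've seen it all
    cursorMark = next;
  }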

Best,
Erick

> On May 27, 2020, at 10:38 AM, Anshuman Singh  
> wrote:
> 
> I have a Solr cloud setup (Solr 7.4) with a collection "test" having two
> shards on two different nodes. There are 4M records equally distributed
> across the shards.
> 
> If I query the collection like below, it is slow.
> http://localhost:8983/solr/*test*/select?q=*:*&rows=100000
> QTime: 6930
> 
> If I query a particular shard like below, it is also slow.
> http://localhost:8983/solr/*test_shard1_replica_n2*
> /select?q=*:*&rows=100000&shards=*shard2*
> QTime: 5494
> *Notice shard2 in shards parameter and shard1 in the core being queried.*
> 
> But this is faster:
> http://localhost:8983/solr/*test_shard1_replica_n2*
> /select?q=*:*&rows=100000&shards=*shard1*
> QTime: 57
> 
> This is also faster:
> http://localhost:8983/solr/*test_shard2_replica_n4*
> /select?q=*:*&rows=100000&shards=*shard2*
> QTime: 71
> 
> I don't think it is the network as I performed similar tests with a single
> node setup as well. If you query a particular core and the corresponding
> logical shard, it is much faster than querying a different shard or core.
> 
> Why is this the behaviour? How can I make the first two queries work as fast
> as the last two queries?



Re: search in solrcloud on replicas

2020-05-27 Thread Erick Erickson
The base algorithm for searches picks out one replica from each
shard in a round-robin fashion, without regard to whether it’s on 
the same machine or not.

You can alter this behavior, see: 
https://lucene.apache.org/solr/guide/8_1/distributed-requests.html
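
For example, to keep each request on the machine that received it (a sketch;
shards.preference is available in Solr 7.4+, older versions use
preferLocalShards=true):

  http://machineA:8983/solr/collection/select?q=*:*&shards.preference=replica.location:local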

When you say “the exact same search”, it isn’t quite, in the sense that
it’s going to a different shard as evidenced by distrib=false being
on the URL (I’d guess you already know that, but…). The top-level
request _may_ be forwarded as is, there’s an internal load balancer
that does this. The theory is that all the top-level requests shouldn’t
be handled by the same Solr instance if a client is directly using
the http address of a single node in the cluster for all requests.
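
If you'd rather the client spread those top-level requests itself, building
the SolrJ client from the ZooKeeper ensemble instead of a single node's
address does that; a minimal sketch (hosts and collection name are
placeholders):

  import java.util.Arrays;
  import java.util.Optional;
  import org.apache.solr.client.solrj.impl.CloudSolrClient;

  CloudSolrClient client = new CloudSolrClient.Builder(
          Arrays.asList("zk1:2181", "zk2:2181", "zk3:2181"), Optional.empty())
      .build();
  client.setDefaultCollection("mycollection");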

Best,
Erick



> On May 27, 2020, at 11:12 AM, Odysci  wrote:
> 
> Hi,
> 
> I have a question regarding solrcloud searches on both replicas of an index.
> I have a solrcloud setup with 2 physical machines (let's call them A and
> B), and my index is divided into 2 shards, and 2 replicas, such that each
> machine has a full copy of the index. My Zookeeper setup uses 3 instances.
> The nodes and replicas are as follows:
> Machine A:
>  core_node3 / shard1_replica_n1
>  core_node7 / shard2_replica_n4
> Machine B:
>  core_node5 / shard1_replica_n2
>  core_node8 / shard2_replica_n6
> 
> I'm using solrJ and I create the solr client using Http2SolrClient.Builder
> and the IP of machineA.
> 
> Here is my question:
> when I do a search (using solrJ) and I look at the search logs on both
> machines, I see that the same search is being executed on both machines.
> But if the full index is present on both machines, wouldn't it be enough
> just to search on one of the machines?
> In fact, if I turn off machine B, the search returns the correct results
> anyway.
> 
> Thanks a lot.
> 
> Reinaldo



search in solrcloud on replicas

2020-05-27 Thread Odysci
Hi,

I have a question regarding solrcloud searches on both replicas of an index.
I have a solrcloud setup with 2 physical machines (let's call them A and
B), and my index is divided into 2 shards, and 2 replicas, such that each
machine has a full copy of the index. My Zookeeper setup uses 3 instances.
The nodes and replicas are as follows:
Machine A:
  core_node3 / shard1_replica_n1
  core_node7 / shard2_replica_n4
Machine B:
  core_node5 / shard1_replica_n2
  core_node8 / shard2_replica_n6

I'm using solrJ and I create the solr client using Http2SolrClient.Builder
and the IP of machineA.

Here is my question:
when I do a search (using solrJ) and I look at the search logs on both
machines, I see that the same search is being executed on both machines.
But if the full index is present on both machines, wouldn't it be enough
just to search on one of the machines?
In fact, if I turn off machine B, the search returns the correct results
anyway.

Thanks a lot.

Reinaldo


Solr multi core query too slow

2020-05-27 Thread Anshuman Singh
I have a Solr cloud setup (Solr 7.4) with a collection "test" having two
shards on two different nodes. There are 4M records equally distributed
across the shards.

If I query the collection like below, it is slow.
http://localhost:8983/solr/*test*/select?q=*:*&rows=100000
QTime: 6930

If I query a particular shard like below, it is also slow.
http://localhost:8983/solr/*test_shard1_replica_n2*
/select?q=*:*&rows=100000&shards=*shard2*
QTime: 5494
*Notice shard2 in shards parameter and shard1 in the core being queried.*

But this is faster:
http://localhost:8983/solr/*test_shard1_replica_n2*
/select?q=*:*&rows=100000&shards=*shard1*
QTime: 57

This is also faster:
http://localhost:8983/solr/*test_shard2_replica_n4*
/select?q=*:*&rows=100000&shards=*shard2*
QTime: 71

I don't think it is the network as I performed similar tests with a single
node setup as well. If you query a particular core and the corresponding
logical shard, it is much faster than querying a different shard or core.

Why is this the behaviour? How can I make the first two queries work as fast
as the last two queries?


Re: SolrCloud upgrade concern

2020-05-27 Thread Jason Gerlowski
Hi Arnold,

From what I saw in the community, CDCR saw an initial burst of
development around when it was contributed, but hasn't seen much
attention or improvement since.  So while it's been around for a few
years, I'm not sure it's improved much in terms of stability or
compatibility with other Solr features.

Some of the bigger ticket issues still open around CDCR:
- SOLR-11959 no support for basic-auth
- SOLR-12842 infinite retry of failed update-requests (leads to
sync/recovery problems)
- SOLR-12057 no real support for NRT/TLOG/PULL replicas
- SOLR-10679 no support for collection aliases

These are in addition to other more architectural issues: CDCR can be
a bottleneck on clusters with high ingestion rates, CDCR uses
full-index-replication more than traditional indexing setups, which
can cause issues with modern index sizes, etc.

So, unfortunately, no real good news in terms of CDCR maturing much in
recent releases.  Joel Bernstein filed a JIRA recently suggesting its
removal entirely actually.  Though I don't think it's gone anywhere.

That said, I gather from what you said that you're already using CDCR
successfully with Master-Slave.  If none of these pitfalls are biting
you in your current Master-Slave setup, you might not be bothered by
them any more in SolrCloud.  Most of the problems with CDCR are
applicable in master-slave as well as SolrCloud.  I wouldn't recommend
CDCR if you were starting from scratch, and I still recommend you
consider other options.  But since you're already using it with some
success, it might be an orthogonal concern to your potential migration
to SolrCloud.

Best of luck deciding!

Jason

On Fri, May 22, 2020 at 7:06 PM gnandre  wrote:
>
> Thanks for this reply, Jason.
>
> I am mostly worried about CDCR feature. I am relying heavily on it.
> > Although, I am planning to use Solr 8.3. It has been a long time since CDCR
> > was first introduced. I wonder what the state of CDCR is in 8.3. Is it
> stable now?
>
> On Wed, Jan 22, 2020, 8:01 AM Jason Gerlowski  wrote:
>
> > Hi Arnold,
> >
> > The stability and complexity issues Mark highlighted in his post
> > aren't just imagined - there are real, sometimes serious, bugs in
> > SolrCloud features.  But at the same time there are many many stable
> > deployments out there where SolrCloud is a real success story for
> > users.  Small example, I work at a company (Lucidworks) where our main
> > product (Fusion) is built heavily on top of SolrCloud and we see it
> > deployed successfully every day.
> >
> > In no way am I trying to minimize Mark's concerns (or David's).  There
> > are stability bugs.  But the extent to which those need to affect you
> > depends a lot on what your deployment looks like.  How many nodes?
> > How many collections?  How tightly are you trying to squeeze your
> > hardware?  Is your network flaky?  Are you looking to use any of
> > SolrCloud's newer, less stable features like CDCR, etc.?
> >
> > Is SolrCloud better for you than Master/Slave?  It depends on what
> > you're hoping to gain by a move to SolrCloud, and on your answers to
> > some of the questions above.  I would be leery of following any
> > recommendations that are made without regard for your reason for
> > switching or your deployment details.  Those things are always the
> > biggest driver in terms of success.
> >
> > Good luck making your decision!
> >
> > Best,
> >
> > Jason
> >


Re: unified highlighter performance in solr 8.5.1

2020-05-27 Thread David Smiley
Try setting hl.fragsizeIsMinimum=true.
I did some benchmarking and found that this helps quite a bit.
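
For example, toggled on a request (illustrative; the other params just echo
the configuration quoted below):

  /select?q=...&hl=true&hl.method=unified&hl.fl=content_txt_sk_highlight&hl.bs.type=SENTENCE&hl.fragsizeIsMinimum=true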


BTW I used the highlights.alg benchmark file, with some changes to make it
more reflective of your scenario -- offsets in postings, and used "enwiki"
(English Wikipedia) docs which are larger than the Reuters ones (so it
appears, anyway).  I had to do a bit of hacking to use the
LengthGoalBreakIterator, which wasn't previously used by this framework.

~ David


On Tue, May 26, 2020 at 4:42 PM Michal Hlavac  wrote:

> fine, I'll try to write a simple test, thanks
>
> On Tuesday 26 May 2020 at 17:44:52 CEST David Smiley wrote:
>
> > Please create an issue.  I haven't reproduced it yet but it seems unlikely
> > to be user-error.
> >
> > ~ David
> >
> > On Mon, May 25, 2020 at 9:28 AM Michal Hlavac  wrote:
> >
> > > Hi,
> > >
> > > I have field:
> > > <field name="content_txt_sk_highlight" type="..."
> > > stored="true" indexed="false" storeOffsetsWithPositions="true"/>
> > >
> > > and configuration:
> > > true
> > > unified
> > > true
> > > content_txt_sk_highlight
> > > 2
> > > true
> > >
> > > Doing a query with hl.bs.type=SENTENCE takes around 1000 - 1300 ms, which
> > > is really slow.
> > > The same query with hl.bs.type=WORD takes from 8 - 45 ms.
> > >
> > > Is this normal behaviour or should I create an issue?
> > >
> > > thanks, m.