Re: solrcloud with EKS kubernetes

2020-12-23 Thread Abhishek Mishra
Hi Jonathan,
Merry Christmas.
Thanks for the suggestion. To manage IOPS, can we do something with rate
limiting?

Regards,
Abhishek


On Thu, Dec 17, 2020 at 5:07 AM Jonathan Tan  wrote:

> Hi Abhishek,
>
> We're running Solr Cloud 8.6 on GKE.
> 3-node cluster, running 4 CPUs (configured) and 8GB of min & max JVM heap
> configured, all with anti-affinity so they never land on the same node.
> It's got 2 collections of ~13M documents each, 6 shards, 3 replicas each;
> disk usage on each node is ~54GB (we've got all the shards replicated to
> all nodes).
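>
> A minimal sketch of that kind of anti-affinity rule, assuming the pods
> carry an "app: solr" label (placeholder names, not our exact manifest):
>
>   affinity:
>     podAntiAffinity:
>       requiredDuringSchedulingIgnoredDuringExecution:
>         - labelSelector:
>             matchLabels:
>               app: solr
>           # at most one Solr pod per Kubernetes node
>           topologyKey: kubernetes.io/hostname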
>
> We're also using a 200GB zonal SSD, which *has* been necessary just so that
> we've got the right IOPS & bandwidth. (That's approximately 6000 IOPS for
> read & write each, and 96MB/s for read & write each.)
>
> Various lessons learnt...
> You definitely don't want them ever on the same Kubernetes node. From a
> resilience perspective, yes, but also when one Solr node gets busy, they
> tend to all get busy, so now you'll have resource contention. Recovery can
> also get very busy and resource intensive, and again, sitting on the same
> node is problematic. We also saw the need to move to SSDs because of how
> IOPS bound we were.
>
> Did I mention use SSDs? ;)
>
> Good luck!
>
> On Mon, Dec 14, 2020 at 5:34 PM Abhishek Mishra 
> wrote:
>
> > Hi Houston,
> > Sorry for the late reply. Each shard is around 9GB in size.
> > Yeah, we are providing enough resources to the pods. We are currently
> > using c5.4xlarge.
> > Xms and Xmx are 16GB. The machine has 32GB RAM and 16 cores.
> > No, I haven't run it outside Kubernetes, but I do have colleagues who ran
> > the same setup on 7.2 and didn't face any issues with it.
> > Storage volume is gp2 50GB.
> > It's not the search queries where we are facing inconsistencies or
> > timeouts; it seems some internal admin APIs sometimes have issues. So
> > adding a new replica to the cluster sometimes results in inconsistencies,
> > e.g. recovery taking more than an hour.
> >
> > Regards,
> > Abhishek
> >
> > On Thu, Dec 10, 2020 at 10:23 AM Houston Putman
> > wrote:
> >
> > > Hello Abhishek,
> > >
> > > It's really hard to provide any advice without more information about
> > > your setup/usage.
> > >
> > > Are you giving your Solr pods enough resources on EKS?
> > > Have you run Solr in the same configuration outside of Kubernetes in
> > > the past without timeouts?
> > > What type of storage volumes are you using to store your data?
> > > Are you using headless services to connect your Solr nodes, or
> > > ingresses?
> > >
> > > If this is the first time that you are using this data + Solr
> > > configuration, maybe it's just that your data within Solr isn't
> > > optimized for the type of queries that you are doing.
> > > If you have run it successfully in the past outside of Kubernetes,
> > > then I would look at the resources that you are giving your pods and
> > > the storage volumes that you are using.
> > > If you are using Ingresses, that might be causing slow connections
> > > between nodes, or between your client and Solr.
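> > >
> > > (For clarity: a headless service is just a Service with clusterIP: None,
> > > so Solr nodes resolve each other's pod IPs directly rather than going
> > > through a load-balanced proxy.) A minimal sketch, with placeholder names:
> > >
> > >   apiVersion: v1
> > >   kind: Service
> > >   metadata:
> > >     name: solr-headless
> > >   spec:
> > >     clusterIP: None   # headless: DNS returns the pod IPs directly
> > >     selector:
> > >       app: solr
> > >     ports:
> > >       - name: solr
> > >         port: 8983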
> > >
> > > - Houston
> > >
> > > On Wed, Dec 9, 2020 at 3:24 PM Abhishek Mishra 
> > > wrote:
> > >
> > > > Hello guys,
> > > > We are facing some issues (like timeouts, etc.) that are very
> > > > inconsistent. By any chance could they be related to EKS? We are using
> > > > Solr 7.7 and ZooKeeper 3.4.13. Should we move to ECS?
> > > >
> > > > Regards,
> > > > Abhishek
> > > >
> > >
> >
>


distrib.requestTimes and distrib.totalTime metric always show 0 for any sub-metric

2020-12-23 Thread gnandre
*distrib.requestTimes and *distrib.totalTime metrics always show 0 for every
sub-metric. Only the *local.requestTimes and *local.totalTime metrics have
non-zero values. This is when we hit the solr:8983/solr/admin/metrics
endpoint.

e.g.

  "QUERY./select.distrib.requestTimes":{
"count":0,
"meanRate":0.0,
"1minRate":0.0,
"5minRate":0.0,
"15minRate":0.0,
"min_ms":0.0,
"max_ms":0.0,
"mean_ms":0.0,
"median_ms":0.0,
"stddev_ms":0.0,
"p75_ms":0.0,
"p95_ms":0.0,
"p99_ms":0.0,
"p999_ms":0.0},


  "QUERY./select.local.requestTimes":{
"count":921,
"meanRate":0.016278013505962197,
"1minRate":0.02502213358051701,
"5minRate":0.01792972725206014,
"15minRate":0.016913129796499247,
"min_ms":0.092099,
"max_ms":27.833606,
"mean_ms":1.5546483254237826,
"median_ms":0.211898,
"stddev_ms":2.353088809601306,
"p75_ms":0.278897,
"p95_ms":5.547842,
"p99_ms":5.547842,
"p999_ms":9.239902},


  "QUERY./select.requestTimes":{
"count":921,
"meanRate":0.01627801345713971,
"1minRate":0.02502213358051701,
"5minRate":0.01792972725206014,
"15minRate":0.016913129796499247,
"min_ms":0.094899,
"max_ms":27.840706,
"mean_ms":1.5588447262406753,
"median_ms":0.216198,
"stddev_ms":2.352629359382386,
"p75_ms":0.284497,
"p95_ms":5.551242,
"p99_ms":5.551242,
"p999_ms":9.242902},


I am using Solr 8.5.2 in standalone mode. I have some queries that are
distributed in the sense that they use the shards parameter to spread the
query across different cores. I was expecting the distrib metrics to have
some value when I execute these distributed queries.
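
For concreteness, this is roughly the shape of the calls (hostnames and core
names are illustrative, not my actual setup):

  # distributed query: the shards parameter fans the request out to two cores
  curl "http://localhost:8983/solr/core1/select?q=*:*&shards=localhost:8983/solr/core1,localhost:8983/solr/core2"

  # metrics filtered down to the /select handler timers
  curl "http://localhost:8983/solr/admin/metrics?group=core&prefix=QUERY./select"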

Also, why is there a third metric besides local and distrib?


Re: Indexing performance 7.3 vs 8.7

2020-12-23 Thread Bram Van Dam
On 23/12/2020 16:00, Ron Buchanan wrote:
>   - both run Java 1.8, but 7.3 is running HotSpot and 8.7 is running
>   OpenJDK (and a bit newer)

If you're using G1GC, you probably want to give Java 11 a go. It's an
easy thing to test, and it's had a positive impact for us. Your mileage
may vary.

 - Bram


Indexing performance 7.3 vs 8.7

2020-12-23 Thread Ron Buchanan
(this is long, just trying to be thorough)

I'm working on upgrading from Solr 7.3 to Solr 8.7 and I am seeing a
significant drop in indexing throughput during a full index reload - from
~1300 documents/sec to ~450 documents/sec.

Background:

VM hosts (these are configured identically):


   - Our Solr clusters run in a virtualized environment.
   - Each Virtual Machine has 8 CPUs and 64GB RAM.
   - The hosts are organized into two 4-host clusters: one for 7.3 and
   one for 8.7.
   - Each cluster has its own 3-VM ZooKeeper cluster (running the
   version that was current at the time of install).


JVM:


   - All the JVMs are set up with -Xms28g and -Xmx28g.
   - The Solr 8.7 cluster is running with the default JVM settings
   (i.e., as configured by the Solr install script) **other than memory**.
   - The Solr 7.3 cluster was configured a while ago, but I'm fairly sure
   it's running pretty vanilla JVM settings (if not outright default)
   **other than memory**.
   - The most obvious difference between the JVM settings for the two
   environments is the garbage collector: ConcurrentMarkSweep for 7.3 and
   G1GC for 8.7 (see the sketch after this list).
   - Both run Java 1.8, but 7.3 is running HotSpot and 8.7 is running
   OpenJDK (and a bit newer).
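
One way to rule the collector in or out: solr.in.sh lets you override it via
GC_TUNE, so the 8.7 cluster could be run once with CMS for an A/B comparison.
A rough sketch (flags illustrative, and Java 8 only, since CMS is gone from
later JDKs):

   # solr.in.sh - pin the 8.7 cluster to the 7.3-era collector
   GC_TUNE="-XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=50"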


Solr:


   - 1 shard, 1 replica per host - all NRT (both clusters).
   - Both the Solr 7.3 and 8.7 clusters are running the same schema.
   - With one exception, only the most minimal changes were made to the
   default Solr 8.7 solrconfig.xml to keep it in line with the 7.3
   solrconfig (mostly around cache settings).
      - The exception: running with luceneMatchVersion=7.3.0 (see the
      snippet just below).
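
For reference, that pin lives in solrconfig.xml:

   <luceneMatchVersion>7.3.0</luceneMatchVersion>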


Data Loading:


   - Data is loaded by a completely separate VM running a custom Java
   process that collects data from the source, generates SolrInputDocuments
   from it, and sends them via CloudSolrClient (roughly as sketched after
   this list).
   - This Java process is multi-threaded, with an upper limit on the number
   of simultaneous threads sending documents and on the size of the document
   payload.
   - We are loading ~10 million documents during a full reload - this is a
   product catalog, so the documents actually represent data about SKUs we
   sell (and they aren't particularly large, though the size is variable).
   - The existing Solr 7.3 cluster has a full-reload time of around 2.5
   hours; the Solr 8.7 cluster requires around 6.25 hours.
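
A minimal sketch of the shape of that loader, assuming the 8.x SolrJ API
(collection name, URL, and batch size are placeholders, not our actual code):

   import java.util.ArrayList;
   import java.util.Collections;
   import java.util.List;
   import org.apache.solr.client.solrj.impl.CloudSolrClient;
   import org.apache.solr.common.SolrInputDocument;

   public class CatalogLoader {
       public static void main(String[] args) throws Exception {
           // CloudSolrClient routes each batch to the right shard leaders
           try (CloudSolrClient client = new CloudSolrClient.Builder(
                   Collections.singletonList("http://solr-host:8983/solr")).build()) {
               client.setDefaultCollection("catalog");
               List<SolrInputDocument> batch = new ArrayList<>();
               for (int i = 0; i < 10_000; i++) {
                   SolrInputDocument doc = new SolrInputDocument();
                   doc.addField("id", "sku-" + i);
                   doc.addField("name_s", "example SKU " + i);
                   batch.add(doc);
                   if (batch.size() == 500) { // cap the per-request payload
                       client.add(batch);
                       batch.clear();
                   }
               }
               if (!batch.isEmpty()) client.add(batch);
               client.commit(); // single commit at the end of the load
           }
       }
   }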


Efforts so far:

   - Checked network speed from the VM generating updates (it's the same
   server for both 7.3 and 8.7) to the clusters:
      - performance to the 8.7 cluster is actually better.
   - As best as possible, controlled for VM topology (i.e., distribution of
   the VMs across hosts within the VM cluster).
   - Real-time JVM monitoring with VisualVM during indexing on the 8.7
   cluster:
      - looked nice - same as I've always seen for the 7.3 cluster.
   - Checked the GC logs with GCEasy:
      - reported as healthy.


Thoughts/questions/considerations:

   - Could running an older luceneMatchVersion affect indexing performance?
   - Still a little concerned that the VM topology is affecting things (our
   VM crew split the 7.3 cluster across VM clusters in an attempt to improve
   resiliency in case of VM cluster failure, and that's not something we can
   or want to replicate) - that said, the performance difference is
   consistent with what I've seen in our QA environment, and that
   environment has a less even spread of VMs across hosts (e.g., multiple
   Solr VMs on the same VM host).
   - We have a couple of custom tokenizers and TokenFilters - those were
   rebuilt using the 8.7.0 versions of solr-core and lucene-core - they're
   pretty simple and I'm not terribly concerned about this, but it is
   non-standard.
   - Query performance is comparable between 7.3 and 8.7, and the documents
   returned are reasonably consistent (few really big differences, mostly
   just scoring differences that affect ordering).
   - After watching the 8.7 JVMs in real time during indexing, I decided to
   drop the memory to -Xms20g and -Xmx20g - this had no effect on indexing
   speed (or GC impact) - so I think it's at least safe to say this is not
   memory-bound.


Final question:

Is it simply typical to see significantly worse indexing performance on 8.7
than on 7.3?

Any suggestions on where to look would be highly appreciated.

Thanks,

Ron


Re: Data Import Handler (DIH) - Installing and running

2020-12-23 Thread Erick Erickson
Have you done what the message says and looked at your Solr log? If so,
what information is there?
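
For a default install, the relevant log is usually server/logs/solr.log, e.g.:

   tail -n 200 server/logs/solr.log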

> On Dec 23, 2020, at 5:13 AM, DINSD | SPAutores wrote:
> 
> Hi,
> 
> I'm trying to install the "data-import-handler" package, since it was
> discontinued from the core Solr distribution.
> 
> https://github.com/rohitbemax/dataimporthandler
> 
> However, as soon as the first command is carried out
> 
> solr -c -Denable.packages=true
> 
> I get this screen in the web interface (screenshot not attached):
> 
> Has anyone been through this, or have any idea why it's happening?
> 
> Thanks for any help
> Rui Pimentel
> 
> DINSD - Departamento de Informática / SPA Digital
> Av. Duque de Loulé, 31 - 1069-153 Lisboa  PORTUGAL
> T (+ 351) 21 359 44 36 / (+ 351) 21 359 44 00  F (+ 351) 21 353 02 57
>  informat...@spautores.pt
>  www.SPAutores.pt
> 



Data Import Handler (DIH) - Installing and running

2020-12-23 Thread DINSD | SPAutores

Hi,

I'm trying to install the "data-import-handler" package, since it was
discontinued from the core Solr distribution.


https://github.com/rohitbemax/dataimporthandler

However, as soon as the first command is carried out

solr -c -Denable.packages=true

I get this screen in the web interface (screenshot not attached):

Has anyone been through this, or have any idea why it's happening?

Thanks for any help

Rui Pimentel

DINSD - Departamento de Informática / SPA Digital
Av. Duque de Loulé, 31 - 1069-153 Lisboa PORTUGAL
T (+ 351) 21 359 44 36 / (+ 351) 21 359 44 00  F (+ 351) 21 353 02 57
informat...@spautores.pt
www.SPAutores.pt

