Re: Fastest way to index data to solr

2022-09-29 Thread Gus Heck
70 million can be a lot or a little. Doc count is not even half the story. How much storage space do these documents occupy in the database? Is the text tweet-sized, or multi-megabyte CLOBs, or links to files on a file store that need to be fetched and parsed (or OCR'd or converted from

Re: Fastest way to index data to solr

2022-09-29 Thread Shawn Heisey
On 9/29/22 22:28, Gus Heck wrote: * Do NOT commit during the bulk load, wait until the end Unless something changed this is slightly risky. It can lead to very large transaction logs and very long playback of the tx log on startup. It is always good practice to have autoCommit configured with
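
For readers following the autoCommit point above, here is a minimal sketch of setting a hard autoCommit through Solr's Config API so the transaction log is flushed periodically during a bulk load. The 60-second interval, the choice of openSearcher=false, and the helper names autocommit_payload and apply_autocommit are assumptions made for illustration; they are not taken from the message, which is cut off before it names any values.

```python
import json
import urllib.request


def autocommit_payload(max_time_ms=60_000, open_searcher=False):
    """Build a Config API body that sets hard autoCommit properties.

    The default values are illustrative assumptions, not recommendations
    from the thread.
    """
    return {
        "set-property": {
            "updateHandler.autoCommit.maxTime": max_time_ms,
            "updateHandler.autoCommit.openSearcher": open_searcher,
        }
    }


def apply_autocommit(base_url, collection, payload):
    """POST the payload to the collection's /config endpoint.

    base_url and collection are placeholders, e.g. "http://localhost:8983"
    and "mycollection". This function is defined but not called here.
    """
    req = urllib.request.Request(
        f"{base_url}/solr/{collection}/config",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


# Show the request body without contacting a server.
print(json.dumps(autocommit_payload(), indent=2))
```

The same settings can also live in the autoCommit element of solrconfig.xml; the Config API route is shown only because it keeps the example self-contained.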

Re: Fastest way to index data to solr

2022-09-29 Thread Gus Heck
> > * Do NOT commit during the bulk load, wait until the end > Unless something changed this is slightly risky. It can lead to very large transaction logs and very long playback of the tx log on startup. If Solr goes down during indexing due to something like an OOM, it could take a very long time

Re: NullPointer Exception when using Cross Collection Join

2022-09-29 Thread Joel Bernstein
Can you share the stack trace? Also in the Solr log there will be a call to the /export handler. Can you get that from the log? Then we can isolate the call to the export handler and see if we can reproduce it. Joel Bernstein http://joelsolr.blogspot.com/ On Thu, Sep 29, 2022 at 3:01 PM

Re: NPE in collapse

2022-09-29 Thread Joel Bernstein
Oh, there are no segments... If this error is still occurring in the latest Solr version without top_fc hint then it's a bug. Joel Bernstein http://joelsolr.blogspot.com/ On Thu, Sep 29, 2022 at 3:27 PM Joel Bernstein wrote: > What version of Solr are you using? > > Try removing the top_fc

Re: NPE in collapse

2022-09-29 Thread Joel Bernstein
What version of Solr are you using? Try removing the top_fc hint, does the error still occur? Joel Bernstein http://joelsolr.blogspot.com/ On Thu, Sep 29, 2022 at 12:47 PM 南拓弥 wrote: > Hello all, > > NPE in collapse hint=top_fc, > When there are no segments, using hint=top_fc in collapse

Re: NullPointer Exception when using Cross Collection Join

2022-09-29 Thread Mikhail Khludnev
Hi, Sean. It's not clear whether this can be reproduced with a bare Solr distro install, indexing a few docs and querying it, or whether it's something about hacking/customising it as a library? On Thu, Sep 29, 2022 at 7:42 PM Sean Wu wrote: > Hi, Solr team. > > I'm using Solr 9.0.0 and when I query with Cross

Re: Conditional Joins in Solr

2022-09-29 Thread Mikhail Khludnev
Hi, Jason. Could it be something like q=id:* -Group:2 -{!join from=id to=linkedIDs}Group:2 ? On Thu, Sep 29, 2022 at 7:47 PM Kahler, Jason J (US) wrote: > Is it possible to have a solr join query only apply under certain > conditions? We have a solr document store that performs access control >
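
The local-params syntax in the suggested query contains braces, spaces and an exclamation mark, so it needs URL-encoding when sent over HTTP. The sketch below only shows that encoding step for the query string quoted above; the helper name build_select_params is hypothetical, and whether the query actually meets the access-control requirement is not established in the thread.

```python
from urllib.parse import parse_qs, urlencode


def build_select_params(group):
    """URL-encode the query suggested above for a given Group value."""
    # Doubled braces are f-string escapes; the result contains single braces.
    q = f"id:* -Group:{group} -{{!join from=id to=linkedIDs}}Group:{group}"
    return urlencode({"q": q})


encoded = build_select_params(2)
print(encoded)

# Decoding the parameter gives back the query exactly as written in the message.
print(parse_qs(encoded)["q"][0])
```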

Re: Fastest way to index data to solr

2022-09-29 Thread Dave
Another way to handle this is to have your indexing code fork out to as many cores as the Solr indexing server has. It’s way less work to force the code to run itself that many times in parallel, and as long as your SQL queries and said tables are properly indexed the database shouldn’t be a

Re: Fastest way to index data to solr

2022-09-29 Thread Andy Lester
> On Sep 29, 2022, at 4:17 AM, Jan Høydahl wrote: > > * Index with multiple threads on the client, experiment to find a good number > based on the number of CPUs on receiving side That may also mean having multiple clients. We went from taking about 8 hours to index our entire 42M rows to

Hadoop vulnerability in Solr 8.11.2 from scan

2022-09-29 Thread Richard Li
Hi, Our vulnerability scanning tool found a vulnerability from Hadoop in Solr 8.11.2. More specifically, it is introduced through org.apache.solr:solr-core@8.11.2 › org.apache.hadoop:hadoop-common@3.2.2. The published vulnerability is listed as CVE-2022-25168:

NPE in collapse

2022-09-29 Thread 南拓弥
Hello all, NPE in collapse hint=top_fc, When there are no segments, using hint=top_fc in collapse results in NPE. * query http://localhost:8983/solr/bukken/select?fq={!collapse field=str_field=top_fc}=true=OR=*:*= * response "error":{ "msg":"Cannot invoke

Conditional Joins in Solr

2022-09-29 Thread Kahler, Jason J (US)
Is it possible to have a solr join query only apply under certain conditions? We have a solr document store that performs access control following various rules related to the data stored in solr. Consider the following scenario { Id:"doc1" linkedIDs:"doc2" Desc:"desc 1" Group:"1" } {

NullPointer Exception when using Cross Collection Join

2022-09-29 Thread Sean Wu
Hi, Solr team. I'm using Solr 9.0.0 and when I query with Cross Collection Join on our own data, there is a NullPointer Exception. Error line: solr-9.0.0/solr/core/src/java/org/apache/solr/handler/export/ExportWriter.java#803 DocIdSetIterator it = new BitSetIterator(bits, 0); It seems like

Re: Fastest way to index data to solr

2022-09-29 Thread Jan Høydahl
Hi, If you want to index fast you should * Make sure you have enough hardware on the Solr side to handle the bulk load * Index with multiple threads on the client, experiment to find a good number based on the number of CPUs on receiving side * If using Java on client, use CloudSolrClient which
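
The "multiple threads on the client" advice can be sketched roughly as follows. This is a plain-Python illustration using only the standard library, not the CloudSolrClient approach the message goes on to mention. The batch size, worker count and the helper names batches, post_batch and index_parallel are assumptions to be tuned by experiment, as the message suggests.

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def batches(docs, size):
    """Yield successive lists of at most `size` documents."""
    it = iter(docs)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def post_batch(update_url, batch):
    """POST one batch as a JSON array to a Solr /update URL (placeholder URL).

    No commit is requested here; committing is left to the server's
    autoCommit settings or to a final explicit commit.
    """
    req = urllib.request.Request(
        update_url,
        data=json.dumps(batch).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status


def index_parallel(docs, send, batch_size=1000, workers=4):
    """Send batches through `send` on a thread pool; results keep batch order.

    Note: Executor.map submits every batch up front, so for very large
    inputs a bounded queue would be a better fit than this sketch.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(send, batches(docs, batch_size)))


# Dry run with a stand-in sender that just reports each batch's size.
docs = [{"id": str(i)} for i in range(2500)]
print(index_parallel(docs, len, batch_size=1000, workers=2))
```

To send to a real server, pass something like functools.partial(post_batch, update_url) as the send argument, where update_url points at the collection's /update handler.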

Fastest way to index data to solr

2022-09-29 Thread Shankar R
Hi, We have nearly 70-80 million documents that need to be indexed in Solr 8.6.1. We want to choose between the Java binary format and direct JSON. Our source data is in a DBMS and is structured. Regards Ravi

Issues with rename command of the Collections API

2022-09-29 Thread Jesús Roca
Hi, I tried to use the rename command of the Collections API to rename a SolrCloud collection, but I couldn't get it to work properly. I'm using Solr 8.11.2 and when I try to rename a collection called "test" to "test-new" with the following command: