Re: Some performance questions....

2018-03-16 Thread Walter Underwood
> On Mar 16, 2018, at 3:26 PM, Deepak Goel wrote: > > Can you please post results of your test? > > Please tell us the tps at 25%, 50%, 75%, 100% of your CPU resource I could, but it probably would not be useful for your documents or your queries. We have 22 million

Re: Solr 6.6.3: Errors when using facet.field

2018-03-16 Thread Jay Potharaju
This is my query: facet=true=true=true=product_id=true=category_id Field def: Tried both with and without docValues. Shards: 2 Has anyone else experienced this error? Thanks Jay Potharaju On Fri, Mar 16, 2018 at 2:20 PM, Jay Potharaju wrote: >

Re: Some performance questions....

2018-03-16 Thread Deepak Goel
On Sat, Mar 17, 2018 at 3:11 AM, Walter Underwood wrote: > > On Mar 16, 2018, at 1:21 PM, Deepak Goel wrote: > > > > However a single client object with thousands of queries coming in would > > surely become a bottleneck. I can test this scenario too. >

Re: Some performance questions....

2018-03-16 Thread Deepak Goel
On Sat, Mar 17, 2018 at 2:56 AM, Shawn Heisey wrote: > On 3/16/2018 2:21 PM, Deepak Goel wrote: > > I wanted to test how many max connections can Solr handle concurrently. > > Also I would have to implement an 'connection pooling' of the > client-object > > connections

Re: Some performance questions....

2018-03-16 Thread Walter Underwood
> On Mar 16, 2018, at 1:21 PM, Deepak Goel wrote: > > However a single client object with thousands of queries coming in would > surely become a bottleneck. I can test this scenario too. No it isn’t. The single client object is thread-safe and manages a pool of connections.

Re: Some performance questions....

2018-03-16 Thread Shawn Heisey
On 3/16/2018 2:21 PM, Deepak Goel wrote: > I wanted to test how many max connections can Solr handle concurrently. > Also I would have to implement an 'connection pooling' of the client-object > connections rather than a single connection thread > > However a single client object with thousands of

Re: Solr 6.6.3: Errors when using facet.field

2018-03-16 Thread Jay Potharaju
It looks like it was fixed as part of 6.6.3: SOLR-6160. FYI: I have 2 shards in my test environment. Thanks Jay Potharaju On Fri, Mar 16, 2018 at 2:07 PM, Jay Potharaju wrote: > Hi, > I am running a simple query with

Solr 6.6.3: Errors when using facet.field

2018-03-16 Thread Jay Potharaju
Hi, I am running a simple query with group by & faceting. facet=true=true=true=product_id=true=true=product_id=1 When I run the query I get errors org.apache.solr.common.SolrException java.lang.IllegalStateException

Solrj Analytics component

2018-03-16 Thread Asmaa Shoala
Hello, I want to use the analytics component (https://lucene.apache.org/solr/guide/7_2/analytics.html#analytic-pivot-facets) in Java code, but I didn't find any guide on the internet. Can you please help me? Thanks, Asmaa Ramzy Shoala novomind Egypt LLC _ 7 Abou

Re: Adding Documents to Solr by using Java Client API is failed

2018-03-16 Thread Andy Tang
Erick, Thank you for the reminder. javac -cp .:/opt/solr/solr-6.6.2/dist/*:/opt/solr/solr-6.6.2/dist/solrj-lib/* AddingDocument.java java -cp .:/opt/solr/solr-6.6.2/dist/*:/opt/solr/solr-6.6.2/dist/solrj-lib/* AddingDocument SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder". SLF4J:

RE: Some performance questions....

2018-03-16 Thread Davis, Daniel (NIH/NLM) [C]
Deepak, A better test of multi-user support might be to vary the queries and try to simulate a realistic 'working set' of search data. I've made this same performance analysis mistake with the search index of www.indexengines.com, which I developed (in part). Somewhat different from Lucene,

Recovering from machine failure

2018-03-16 Thread Andy C
Running Solr 7.2 in SolrCloud mode with 5 Linux VMs. Each VM was a single shard, no replication. Single Zookeeper instance running on the same VM as one of the Solr instances. IT was making changes, and 2 of the VMs won't reboot (including the VM where Zookeeper is installed). There was a

Re: Some performance questions....

2018-03-16 Thread Deepak Goel
On Sat, Mar 17, 2018 at 1:06 AM, Shawn Heisey wrote: > On 3/16/2018 7:38 AM, Deepak Goel wrote: > > I did a performance study of Solr a while back. And I found that it does > > not scale beyond a particular point on a single machine (could be due to > > the way it's coded).

Re: Adding Documents to Solr by using Java Client API is failed

2018-03-16 Thread Erick Erickson
This is the important bit: java.lang.NoClassDefFoundError: org/apache/http/Header That class is not defined in the Solr code at all; it's in httpcore-#.#.#.jar. You probably need to include /opt/solr/solr-6.6.2/dist/solrj-lib in your classpath. Best, Erick On Fri, Mar 16, 2018 at 12:14 PM,
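
A minimal sketch of how one might check, before touching any Solr code, whether the classes named in this thread are actually visible on the classpath. `ClasspathCheck` and `isPresent` are hypothetical names made up for illustration; they are not part of Solr or SolrJ. The two class names come from the stack trace and from the imports in the original post.

```java
// Hypothetical helper (not part of Solr or SolrJ): reports whether a class
// can be located on the current classpath without initializing it.
final class ClasspathCheck {

    static boolean isPresent(String className) {
        try {
            // initialize=false: only locate the class, do not run static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Class names taken from the thread: the missing httpcore class and
        // the SolrJ client class imported by the original program.
        String[] required = {
            "org.apache.http.Header",
            "org.apache.solr.client.solrj.SolrClient"
        };
        for (String name : required) {
            System.out.println(name + " -> " + (isPresent(name) ? "found" : "MISSING"));
        }
    }
}
```

Running it with the same `-cp` value used for the failing program shows which jars are still missing from that classpath.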

Re: Some performance questions....

2018-03-16 Thread Shawn Heisey
On 3/16/2018 7:38 AM, Deepak Goel wrote: > I did a performance study of Solr a while back. And I found that it does > not scale beyond a particular point on a single machine (could be due to > the way it's coded). Hence multiple instances might make sense. > >

Adding Documents to Solr by using Java Client API is failed

2018-03-16 Thread Andy Tang
I have code that adds a document to Solr. I tested it on both Solr 6.6.2 and Solr 7.2.1, and it failed on both. import java.io.IOException; import org.apache.solr.client.solrj.SolrClient; import org.apache.solr.client.solrj.SolrServerException; import org.apache.solr.client.solrj.impl.HttpSolrClient; import

Re: statistics in hitlist

2018-03-16 Thread Joel Bernstein
With regression you're looking at how the change in one variable affects the change in another variable. So you need to have values that are changing. What you described is an average of field X which is not changing, regressed against the value of X. I think one approach to this is to regress
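
The point about needing values that change can be illustrated with a small least-squares sketch. This is plain Java written for illustration only, not Solr's regression implementation; `SimpleRegression` and `slope` are hypothetical names.

```java
// Ordinary least-squares slope of y on x. Illustration only; this is not
// the code Solr uses for its regression functions.
final class SimpleRegression {

    static double slope(double[] x, double[] y) {
        if (x.length != y.length || x.length == 0) {
            throw new IllegalArgumentException("x and y must be non-empty and the same length");
        }
        int n = x.length;
        double meanX = 0, meanY = 0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        double sxy = 0, sxx = 0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            sxy += dx * (y[i] - meanY);
            sxx += dx * dx;
        }
        // If x never varies, sxx is 0 and the result is NaN (0.0 / 0.0).
        return sxy / sxx;
    }

    public static void main(String[] args) {
        double[] x = {1, 2, 3, 4};
        double[] avgOfX = {2.5, 2.5, 2.5, 2.5}; // the average of x, repeated per row

        System.out.println(slope(x, new double[] {2, 4, 6, 8})); // both variables vary
        System.out.println(slope(x, avgOfX));                    // dependent variable is constant
        System.out.println(slope(avgOfX, x));                    // independent variable is constant
    }
}
```

When one side is a constant such as the average, the covariance term is zero, so the slope is either zero or undefined and carries no information about a trend.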

QueryElevator prepare() in distributed search

2018-03-16 Thread Markus Jelsma
Hello, QueryElevator.prepare() runs five times for a single query in distributed search. This is probably not how it should be, but in which phase of distributed search is it actually supposed to run? Many thanks, Markus

Re: Some performance questions....

2018-03-16 Thread Deepak Goel
> That benchmark is on Windows, so not interesting for most of us. I guess I must have missed this in the author's question. Did he describe his OS? Also other applications scale well on Windows. Why would Solr be different? The Solr page does not mention any performance limits on Windows

Re: Some performance questions....

2018-03-16 Thread Deepak Goel
> On Mar 16, 2018, at 6:26 AM, Deepak Goel wrote: > > I would try multiple Solr instances rather than a single Solr instance (it > definitely will give a performance boost) > I would avoid multiple Solr instances on a single machine. I can use all 36 cores on our servers with one Solr

Re: Solr document routing using composite key

2018-03-16 Thread Erick Erickson
What Shawn said. 117 shards and 116 docs tells you absolutely nothing useful. I've never seen the number of docs on various shards be off by more than 2-3% when enough docs are indexed to be statistically valid. Best, Erick On Fri, Mar 16, 2018 at 5:34 AM, Shawn Heisey

Re: Recommendations for non-narrative data

2018-03-16 Thread Erick Erickson
For an index that size, you have a lot of options. I'd completely ignore any discussion that starts with "but our index will be bigger if we do that" until it's proven to be a problem. For reference, I commonly see 200G-300G indexes so Ok, to your problem. Your update rate is very low so

Re: Some performance questions....

2018-03-16 Thread Walter Underwood
On Mar 16, 2018, at 6:38 AM, Deepak Goel wrote: > > I did a performance study of Solr a while back. And I found that it does > not scale beyond a particular point on a single machine (could be due to > the way it's coded). Hence multiple instances might make sense. > >

Re: Some performance questions....

2018-03-16 Thread Walter Underwood
> On Mar 16, 2018, at 6:26 AM, Deepak Goel wrote: > > I would try multiple Solr instances rather than a single Solr instance (it > definitely will give a performance boost) I would avoid multiple Solr instances on a single machine. I can use all 36 cores on our servers with one

Re: question regarding wildcard-searches

2018-03-16 Thread Erick Erickson
If your goal is to search prefixes only, I'd go away from the _text_ field altogether and use a "string" type. This will mean you need to 1> make it multiValued=true 2> split this up (either on your client or use a FieldMutatingUpdateProcessor, probably RegexReplaceProcessorFactory) into separate

Re: In Place Updates not work as expected

2018-03-16 Thread Emir Arnautović
Hi, That’s how you build a regular document. Incremental/atomic updates need to use update commands. I did not check the latest SolrJ, so maybe there is a built-in way of doing that, but a quick search showed how it can be achieved: SolrInputDocument doc2 = new SolrInputDocument();
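
A minimal sketch of what an update command looks like as data, assuming the standard atomic-update `set` modifier. It uses only `java.util` so it runs without SolrJ on the classpath. The names `AtomicUpdateSketch`, `setOp` and `updateDoc`, and the field names `id` and `price`, are hypothetical. With SolrJ, the same modifier map would typically be passed as the field value of a `SolrInputDocument` instead of the plain value.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Builds the shape of an atomic update: the field value is a one-entry map
// {"set": newValue} instead of the plain value used for a regular document.
final class AtomicUpdateSketch {

    // Modifier map for the "set" operation.
    static Map<String, Object> setOp(Object newValue) {
        Map<String, Object> op = new LinkedHashMap<>();
        op.put("set", newValue);
        return op;
    }

    // Partial document: the unique key plus one field carrying a modifier.
    // "id" as the unique key field name is an assumption about the schema.
    static Map<String, Object> updateDoc(String id, String field, Object newValue) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("id", id);
        doc.put(field, setOp(newValue));
        return doc;
    }

    public static void main(String[] args) {
        // Hypothetical document id and field name, for illustration only.
        System.out.println(updateDoc("doc-1", "price", 99));
    }
}
```

A regular document would carry `price` as a bare value; wrapping it in the modifier map is what tells Solr to treat the request as a partial update rather than a full replacement.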

Re: question regarding wildcard-searches

2018-03-16 Thread Emir Arnautović
Hi Roel, As mentioned, the _text_ field probably does not contain the complete “EO.1954.53.1” but only its parts. You can verify that using the analysis screen in the admin console. What you can try is searching for the phrase without a wildcard, “EO.1954.53”, or if you are using WordDelimiterTokenFilter in your
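
A rough sketch of why a wildcard on the whole value can come back empty once the value has been split into parts. It assumes a deliberately simplified analysis step (split on every non-alphanumeric character, then lowercase); the real tokens depend on the field type in the schema, so the analysis screen remains the authority. `TokenSketch` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Simplified stand-in for an analysis chain: split on non-alphanumerics and
// lowercase. Real Solr analyzers may produce different tokens.
final class TokenSketch {

    static List<String> tokens(String value) {
        List<String> out = new ArrayList<>();
        for (String part : value.split("[^\\p{Alnum}]+")) {
            if (!part.isEmpty()) {
                out.add(part.toLowerCase(Locale.ROOT));
            }
        }
        return out;
    }

    // A prefix (wildcard) query is matched against individual indexed tokens,
    // never against the original unsplit string.
    static boolean anyTokenStartsWith(List<String> tokens, String prefix) {
        String p = prefix.toLowerCase(Locale.ROOT);
        for (String t : tokens) {
            if (t.startsWith(p)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> indexed = tokens("EO.1954.53.1");
        System.out.println(indexed);
        System.out.println(anyTokenStartsWith(indexed, "EO.1954.53.")); // prefix spans several tokens
        System.out.println(anyTokenStartsWith(indexed, "195"));         // prefix of a single token
    }
}
```

Under this assumption no single token begins with "EO.1954.53.", which is consistent with the empty result described in the thread.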

solr equivalent for elasticsearch 'terminate_after' param

2018-03-16 Thread Martin Buechler
Hi, To decide whether any search result exists for a given query, you can do this efficiently in ES using 'size=0&terminate_after=1', see https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-body.html#_fast_check_for_any_matching_docs Is there an equivalent for the

Re: In Place Updates not work as expected

2018-03-16 Thread mganeshs
Hi Emir, It's a normal setField and addDocument, e.g. in a for loop: solrInputDocument.setField(sFieldId, fieldValue); and after this, we add the created document. solrClient.add(collectionName, solrInputDocuments); I just want to know whether we need to do something specific for

Recommendations for non-narrative data

2018-03-16 Thread Christopher Schultz
All, I'm using Solr to index and search a database of user data (username, email, first and last name), so there aren't really "terms" in the data to search for, like you might search for words that describe products in a catalog, for example. I have set up my schema to include plain-old text

Re: Some performance questions....

2018-03-16 Thread Deepak Goel
On Fri, Mar 16, 2018 at 6:03 PM, Shawn Heisey wrote: > On 3/15/2018 6:34 AM, BlackIce wrote: > >> However the main app that will be >> running is more or less a single-threaded app which takes advantage when >> run under several instances, i.e. parallelism, so I thought,

Re: Some performance questions....

2018-03-16 Thread Deepak Goel
>I think there is no benefit in having multiple Solr instances on a single >server, unless the heap memory required by the JVM is too big. Deepak*** I would try multiple Solr instances rather than a single Solr instance (it definitely will give a performance boost) Deepak*** >And remember that

Re: statistics in hitlist

2018-03-16 Thread John Smith
Thanks for the link to the documentation, that will probably come in useful. I didn't see a way though, to get my avg function working? So instead of doing a linear regression on two fields, X and Y, in a hitlist, we need to do a linear regression on field X, and the average value of X. Is that

RE: question regarding wildcard-searches

2018-03-16 Thread Paesen Roel
Hi, Unfortunately that also gives no results (and it would not be practical, as for this example the numbering only goes up till 19 but others go up into the thousands etc) Anybody with a pointer on this? Thanks already, Roel -Original Message- From: jagdish vasani

Re: Solr on DC/OS ?

2018-03-16 Thread Søren
Thanks a lot guys. Now we know where to start. Best, Soren On 15-03-2018 09:27, Hendrik Haddorp wrote: Hi, we are running Solr on Marathon/Mesos, which should basically be the same as DC/OS. Solr and ZooKeeper are running in docker containers. I wrote my own Mesos framework that handles

Re: Solr document routing using composite key

2018-03-16 Thread Shawn Heisey
On 3/6/2018 11:53 AM, Nawab Zada Asad Iqbal wrote: I have 117 shards and i tried to use document ids from zero to 116. I find that the distribution is very uneven, e.g., the largest bucket receives total 5 documents; and around 38 shards will be empty. Is it expected? With such a small data
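
A small simulation of the effect discussed in this thread, under loudly stated assumptions: CRC32 stands in for Solr's real routing hash (Solr's router uses a different hash function), and "hash modulo shard count" stands in for hash-range assignment. `ShardSpread` and its methods are hypothetical names. For an ideal uniform hash, the chance that a given shard receives none of 117 documents is (1 - 1/117)^117, about 0.37, so a large number of empty shards at this sample size is expected rather than alarming.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Illustrative simulation only: CRC32 is a stand-in hash, not the one Solr
// uses, and modulo bucketing is a stand-in for hash-range routing.
final class ShardSpread {

    // Returns how many of the ids 0..docs-1 land in each of `shards` buckets.
    static int[] distribute(int docs, int shards) {
        int[] counts = new int[shards];
        for (int id = 0; id < docs; id++) {
            CRC32 crc = new CRC32();
            crc.update(Integer.toString(id).getBytes(StandardCharsets.UTF_8));
            // getValue() is a non-negative long, so the remainder is a valid index
            counts[(int) (crc.getValue() % shards)]++;
        }
        return counts;
    }

    static int emptyBuckets(int[] counts) {
        int empty = 0;
        for (int c : counts) {
            if (c == 0) {
                empty++;
            }
        }
        return empty;
    }

    static int largestBucket(int[] counts) {
        int max = 0;
        for (int c : counts) {
            max = Math.max(max, c);
        }
        return max;
    }

    public static void main(String[] args) {
        // Same shard count as in the thread, with a tiny and a larger sample.
        for (int docs : new int[] {117, 117_000}) {
            int[] counts = distribute(docs, 117);
            System.out.println(docs + " docs: empty=" + emptyBuckets(counts)
                    + ", largest=" + largestBucket(counts));
        }
    }
}
```

Comparing the two runs gives a feel for how much of the unevenness comes from the sample size alone.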

Re: Some performance questions....

2018-03-16 Thread Shawn Heisey
On 3/15/2018 6:34 AM, BlackIce wrote: However the main app that will be running is more or less a single-threaded app which takes advantage when run under several instances, i.e. parallelism, so I thought, since I'm at it I may give solr a few instances as well Solr is a fully threaded app,

Re: Remove Replacement character "�" from the search results

2018-03-16 Thread uttamdhakal
Erick Erickson wrote > This is more likely a problem with your browser's character set, try > setting it to UTF-8. The problem is not with my browser's character set. Anyway I want to remove/replace certain characters from the search result -- Sent from:

RE: SpellCheck Reload

2018-03-16 Thread Sadiki Latty
Thanks Alessandro, I'll give this a try next time. I ended up deleting the spell folder after trying the reload option without success. Next time I will try the reload then build method you suggested. Thanks again for the info. -Original Message- From: Alessandro Benedetti

Re: Using multi valued field in solr cloud Graph Traversal Query

2018-03-16 Thread Jan Høydahl
> Adding multi-value field support is a fairly high priority so I would > expect this to be coming in a future release. I got this question from a client of mine as well. Trying to find a JIRA issue for multi value support, is there one? -- Jan Høydahl, search solution architect Cominvent AS -

Re: question regarding wildcard-searches

2018-03-16 Thread jagdish vasani
Hi Paesen, The value EO.1954.53.1 is indexed as below: Eo 1954 53 1. The dot is removed. Try the single-character wildcard ?, like EO.1954.53.??, if the last part has only 2 digits. I have not tried it, so please check. Hope it will solve your problem. Thanks, Jagdish On 16-Mar-2018 3:51 pm, "Paesen Roel"

Re: LTR - OriginalScore query issue

2018-03-16 Thread ilayaraja
Yes, I have tried that too: But it was throwing error while feature extraction: "Exception from createWeight for SolrFeature [name=originalLuceneScore, params={q={!dismax qf=tem_type_all^30.0 ..}${user_query}}] Failed to parse feature query. at

Re: LTR - OriginalScore query issue

2018-03-16 Thread Alessandro Benedetti
I understood your requirement, the SolrFeature feature type should be quite flexible, have you tried : { name: "overallEdismaxScore", class: "org.apache.solr.ltr.feature.SolrFeature", params: { q: "{!dismax qf=item_typel^3.0 brand^2.0 title^5.0}${user_query}" }, store: "myFeatureStoreDemo",

question regarding wildcard-searches

2018-03-16 Thread Paesen Roel
Hi everybody, We are experimenting with Solr, and I have a (I think) basic-level question: we have multiple fields, all copied into a generic field so we can search everything at once. However we have a (for us) strange situation doing wildcard searches for the contents of one specific field.

Re: Apache commons fileupload migration

2018-03-16 Thread padmanabhan1616
Yes, I read the 1.3.3 changelog. This release contains the security vulnerability fix. DiskFileItem can actually no longer be deserialized, *unless a system property is set to true*. Fixes FILEUPLOAD-279. We don't have security architecture for my product to decide whether it is vulnerable or