In Place Updates not work as expected

2018-02-15 Thread mganeshs
All, I have (say) 1M Solr documents (in production it would be even more), each with a lot of fields and fairly large. We have a feature where we need to update a specific field, or add a new field, in each document. Since we have to do this for all 1M documents, it's taking up more time
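For readers following this thread, here is a minimal sketch of the request body a partial ("atomic") update typically uses: the document id plus a `set` modifier on the one field being changed. As far as the Solr Reference Guide describes it, Solr only applies such an update in place when the target field is a single-valued numeric docValues field that is neither indexed nor stored; otherwise the whole document is re-indexed internally. The field name `popularity`, the ids and the class name are hypothetical. The snippet only builds the JSON string, does not contact Solr, and assumes ids contain no characters that need JSON escaping.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class AtomicUpdatePayload {
    // Builds one partial-update entry: {"id":"<id>","<field>":{"set":<value>}}
    // Assumption: id and field contain no quotes or backslashes (no JSON escaping is done).
    static String setField(String id, String field, long value) {
        return "{\"id\":\"" + id + "\",\"" + field + "\":{\"set\":" + value + "}}";
    }

    // Wraps many entries in a JSON array so a whole batch can go in one update request.
    static String batch(List<String> ids, String field, long value) {
        return ids.stream()
                .map(id -> setField(id, field, value))
                .collect(Collectors.joining(",", "[", "]"));
    }

    public static void main(String[] args) {
        // "popularity" is a hypothetical field name used only for illustration.
        System.out.println(batch(Arrays.asList("doc1", "doc2"), "popularity", 42));
    }
}
```

Sending many ids per request, rather than one request per document, is usually the first thing to try when 1M updates are slow.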

Re: Solr running on Tomcat

2018-02-15 Thread Erick Erickson
Why do you think Solr on Tomcat == scalability? Solr has not been distributed as a war file for some time, see: https://wiki.apache.org/solr/WhyNoWar Just run it as a server. Eventually it won't even use Jetty, but something like Netty, etc. Best, Erick On Thu, Feb 15, 2018 at 7:54 PM, GVK

Re: Index size increases disproportionately to size of added field when indexed=false

2018-02-15 Thread Erick Erickson
This isn't terribly useful without a similar dump of "the other" index directory. The point is to compare the different extensions for a segment where the sum of all the files in that segment is roughly equal. So if you have a listing of the old index around, that would help. bq: We don't have any

Solr running on Tomcat

2018-02-15 Thread GVK Prasad
I read some posts on setting up Solr to run on Tomcat, but all of these posts are about Solr version 4.0 or earlier. I am thinking of hosting Solr on Tomcat for scalability. Any recommendations on this? Prasad

Solr streaming expression - options for Full Outer Join

2018-02-15 Thread Ganesh Sethuraman
I am using Solr 7.2.1. I would like to perform a full outer join (emit documents from both left and right and, if there are common ones, combine them) with Solr streaming decorators on two collections and "update" the result to a new destination collection. I see "merge" decorator option exists, but this seems to
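To pin down the semantics being asked for, here is a small in-memory sketch of a full outer join on a key field: every document from either side is emitted, and documents sharing a key are combined into one. This is plain Java, not a Solr streaming expression; the class name, the `doc` helper and the field names are hypothetical. It assumes keys are unique within each side, and on a field-name clash the right-hand value wins.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class FullOuterJoinSketch {
    // Small helper to build a document from alternating key/value strings.
    static Map<String, String> doc(String... kv) {
        Map<String, String> m = new LinkedHashMap<>();
        for (int i = 0; i + 1 < kv.length; i += 2) {
            m.put(kv[i], kv[i + 1]);
        }
        return m;
    }

    static List<Map<String, String>> fullOuterJoin(List<Map<String, String>> left,
                                                   List<Map<String, String>> right,
                                                   String key) {
        // Index by join key; LinkedHashMap keeps first-seen order for stable output.
        Map<String, Map<String, String>> byKey = new LinkedHashMap<>();
        for (Map<String, String> d : left) {
            byKey.put(d.get(key), new LinkedHashMap<>(d));
        }
        for (Map<String, String> d : right) {
            // Unseen key: the right doc is emitted alone. Seen key: its fields are merged in.
            byKey.computeIfAbsent(d.get(key), k -> new LinkedHashMap<>()).putAll(d);
        }
        return new ArrayList<>(byKey.values());
    }

    public static void main(String[] args) {
        List<Map<String, String>> left = Arrays.asList(doc("id", "1", "a", "x"), doc("id", "2", "a", "y"));
        List<Map<String, String>> right = Arrays.asList(doc("id", "2", "b", "z"), doc("id", "3", "b", "w"));
        System.out.println(fullOuterJoin(left, right, "id"));
    }
}
```

For collections too large to hold in memory, the same logic is normally done as a merge over two inputs sorted on the join key.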

RE: Index size increases disproportionately to size of added field when indexed=false

2018-02-15 Thread Howe, David
Hi Erick, I have the full dump of the Solr index file sizes as well if that is of any help. I have attached it below this message. We don't have any deleted docs in our index, as we always build it from a brand new virtual machine with a brand new installation of Solr. The ordering is

Re: Solr performance issue

2018-02-15 Thread Shawn Heisey
On 2/15/2018 2:00 AM, Srinivas Kashyap wrote: > I have implemented 'SortedMapBackedCache' in my SqlEntityProcessor for the > child entities in data-config.xml. And I'm using the same for full-import > only. And in the beginning of my implementation, I had written a delta-import > query to index

Re: Reading data from Oracle

2018-02-15 Thread Shawn Heisey
On 2/15/2018 12:34 AM, LOPEZ-CORTES Mariano-ext wrote: > We've done the following test: From a Java program, we read chunks of data > from Oracle and inject them into Solr (via SolrJ). > > The problem: it is really, really slow (1.5 nights). > > Is there a faster method to do that? Are you indexing

RE: Solr search word NOT followed by another word

2018-02-15 Thread Allison, Timothy B.
Nice. Thank you! -Original Message- From: Emir Arnautović [mailto:emir.arnauto...@sematext.com] Sent: Thursday, February 15, 2018 2:19 PM To: solr-user@lucene.apache.org Subject: Re: Solr search word NOT followed by another word Hi, I did not provide the right query. If you query as

Re: Solr search word NOT followed by another word

2018-02-15 Thread Emir Arnautović
Hi, I did not provide the right query. If you query as {!complexphrase df=name}"Leonardo -da -Vinci" all works as expected. This matches all three docs. HTH, Emir
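When a query like the one above is sent over HTTP, the braces, quotes and spaces need URL encoding. A minimal sketch of building the `q` parameter with the JDK's `URLEncoder` follows; the class and method names are hypothetical, and the field and phrase are taken from the message above.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class ComplexPhraseQueryParam {
    // Returns "q=<url-encoded query>" so that braces, quotes and spaces survive the HTTP request.
    static String qParam(String field, String phrase) {
        // Straight double quotes delimit the phrase; the local-params prefix selects the parser.
        String raw = "{!complexphrase df=" + field + "}\"" + phrase + "\"";
        try {
            return "q=" + URLEncoder.encode(raw, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be supported by every JVM, so this should not happen.
            throw new IllegalStateException("UTF-8 not available", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(qParam("name", "Leonardo -da -Vinci"));
    }
}
```

Note that typographic quotes pasted from a mail client are not the same characters as the straight quotes the query parser expects.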

RE: Solr search word NOT followed by another word

2018-02-15 Thread Allison, Timothy B.
I just updated the SpanQueryParser (LUCENE-5205) and its Solr plugin (SOLR-5410) for master and 7.2.1. What version of Solr are you using and which version of the plugin? These should be available on maven central shortly: version 7.2-0.1 (groupId org.tallison.solr, artifactId solr-5410, version 7.2-0.1) Or

Re: Issue Using JSON Facet API Buckets in Solr 6.6

2018-02-15 Thread Antelmo Aguilar
Hi, Here are two pastebins. The first is the complete response with the search parameters used. The second is the stack trace from the logs: https://pastebin.com/rsHvKK63 https://pastebin.com/8amxacAj I am not using any custom code or plugins with the Solr instance. Please let me know

RE: Solr search word NOT followed by another word

2018-02-15 Thread Allison, Timothy B.
I've been away from the ComplexQueryParser for a while, and I was wrong when I said in my earlier email that no currently included Solr parser generates a SpanNotQuery. You're right, Emir, that the ComplexQueryParser does generate a SpanNotQuery, and, yes, I just tried this with 7.2.1, and it

Re: Solr performance issue

2018-02-15 Thread Erick Erickson
Srinivas: Not an answer to your question, but when DIH starts getting this complicated, I start to seriously think about SolrJ, see: https://lucidworks.com/2012/02/14/indexing-with-solrj/ In particular, it moves the heavy lifting of acquiring the data from a Solr node (which I'm assuming also

Re: Reading data from Oracle

2018-02-15 Thread Erick Erickson
Very simple way to know where to start looking: just don't send the docs to Solr. Somewhere you have some code like:
    SolrClient client = new CloudSolrClient...
    while (more docs from the DB) {
        doc_list = build_document_list();
        client.add(doc_list);
    }
Just comment out the client.add line
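A slightly more instrumented variant of the same idea, as a sketch: time the "read and build documents" phase and the "send to Solr" phase separately, so the numbers show which side dominates. This uses only the JDK; `BottleneckTimer`, `nextBatch` and `send` are hypothetical names. In a real program `nextBatch` would read from the database and `send` would wrap the `client.add` call, and passing a no-op for `send` is equivalent to commenting that line out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Supplier;

class BottleneckTimer {
    // Elapsed nanoseconds per phase, plus the number of non-empty batches processed.
    static final class Result {
        long readNanos;
        long sendNanos;
        int batches;
    }

    // Pulls batches until the source returns an empty list, timing read and send separately.
    static <T> Result run(Supplier<List<T>> nextBatch, Consumer<List<T>> send) {
        Result r = new Result();
        while (true) {
            long t0 = System.nanoTime();
            List<T> batch = nextBatch.get();   // e.g. read rows from the DB and build documents
            long t1 = System.nanoTime();
            r.readNanos += t1 - t0;
            if (batch.isEmpty()) {
                break;                         // an empty batch signals the end of the data
            }
            send.accept(batch);                // e.g. client.add(batch); use a no-op to skip Solr
            r.sendNanos += System.nanoTime() - t1;
            r.batches++;
        }
        return r;
    }

    public static void main(String[] args) {
        // Stand-in data source: two small batches, then empty.
        Iterator<List<String>> source = Arrays.asList(
                Arrays.asList("a", "b"), Arrays.asList("c")).iterator();
        List<String> sink = new ArrayList<>();
        Result r = run(() -> source.hasNext() ? source.next() : Collections.<String>emptyList(),
                       sink::addAll);
        System.out.println("batches=" + r.batches
                + " readMs=" + r.readNanos / 1_000_000
                + " sendMs=" + r.sendNanos / 1_000_000);
    }
}
```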

Re: Index size increases disproportionately to size of added field when indexed=false

2018-02-15 Thread Erick Erickson
David: Rats, the cfs files make everything I'd hoped to understand with the sizes ambiguous, since they conceal the underlying sizes of each other extension. We can approach it a bit differently though. Take one segment that's _not_ in cfs format where the total size of all files making up that

RE: solr ltr jar is not able to recognize MultipleAdditiveTreesModel

2018-02-15 Thread Brian Yee
I'm not sure if this will solve your problem, but you are using a very old version of RankLib. The most recent version is 2.9. https://sourceforge.net/projects/lemur/files/lemur/RankLib-2.9/ -Original Message- From: kusha.pande [mailto:kusha.pa...@gmail.com] Sent: Thursday, February

solr ltr jar is not able to recognize MultipleAdditiveTreesModel

2018-02-15 Thread kusha.pande
Hi, I am trying to upload a trained model generated from the RankLib jar using LambdaMART. The model is like {"class":"org.apache.solr.ltr.model.MultipleAdditiveTreesModel", "name":"lambdamartmodel", "params" : { "trees" :[ { "id": "1", "weight": "0.1", "split": {

Re: facet.method=uif not working in solr cloud?

2018-02-15 Thread Yonik Seeley
On Wed, Feb 14, 2018 at 7:24 PM, Wei wrote: > Thanks Yonik. If uif has a big upfront cost when a request hits Solr the first time, > in SolrCloud the same faceting request could hit different replicas in the > same shard, so that cost will be incurred at least once per replica? >

Re: Index size increases disproportionately to size of added field when indexed=false

2018-02-15 Thread Pratik Patel
@Alessandro I will see if I can reproduce the same issue just by turning off omitNorms on field type. I'll open another mail thread if required. Thanks. On Thu, Feb 15, 2018 at 6:12 AM, Howe, David wrote: > > Hi Alessandro, > > Some interesting testing today that

Re: solr read timeout

2018-02-15 Thread Jason Gerlowski
Hi Prateek, Depending on the SolrServer/SolrClient implementation your application is using, you can make use of the "setSoTimeout" method, which controls the socket (read) timeout in milliseconds. e.g.

solr read timeout

2018-02-15 Thread Prateek Jain J
Hi All, I am using Solr 4.8.1 in one of our applications and sometimes it gives a read timeout error. SolrJ is used on the client side. How can I increase the default read timeout? Regards, Prateek Jain

Re: Multiple context fields in suggester component

2018-02-15 Thread Alessandro Benedetti
You can start from here: org/apache/solr/spelling/suggest/SolrSuggester.java:265 Cheers, Alessandro Benedetti

Re: Multiple context fields in suggester component

2018-02-15 Thread Renuka Srishti
Thanks, Alessandro Benedetti, for the response. Could you please share the resources, so that I can explore customization of the context filter further? On Tue, Feb 13, 2018 at 5:01 PM, Alessandro Benedetti wrote: > Simple answer is No. > Only one context field is supported out

Re: Reading data from Oracle

2018-02-15 Thread Bernd Fehling
So it is not SolrJ, but Solr that is your problem? In your first email there was nothing about heap exceptions, only the runtime of the load. What do you mean by "injecting too many rows", what is "too many"? Some numbers while loading from scratch: - single node 412GB index - 92 fields - 123.6

RE: Index size increases disproportionately to size of added field when indexed=false

2018-02-15 Thread Howe, David
Hi Alessandro, Some interesting testing today that seems to have gotten me closer to what the issue is. When I run the version of the index that is working correctly against my database table that has the extra field in it, the index suddenly increases in size. This is even though the data

Re: Reading data from Oracle

2018-02-15 Thread Michal Hlavac
Did you try to use ConcurrentUpdateSolrClient instead of HttpSolrClient? m. On Thursday, 15 February 2018 8:34:06 CET LOPEZ-CORTES Mariano-ext wrote: > Hello > > We have to delete our Solr collection and feed it periodically from an Oracle > database (up to 40M rows). > > We've done the

RE: Index size increases disproportionately to size of added field when indexed=false

2018-02-15 Thread Alessandro Benedetti
@Pratik: you should have investigated. I understand that it solved your issue, but in case you needed norms, it doesn't make sense that they would cause your index to grow by a factor of 30. You must have hit a nasty bug if it was just the norms. @Howe : *Compound File* .cfs, .cfe An optional

RE: Reading data from Oracle

2018-02-15 Thread LOPEZ-CORTES Mariano-ext
Injecting too many rows into Solr throws a Java heap exception (more memory needed? We have 8GB per node). Does DIH support paging queries? Thanks! -Original Message- From: Bernd Fehling [mailto:bernd.fehl...@uni-bielefeld.de] Sent: Thursday, 15 February 2018 10:13 To:

Re: Solr Recommended setup

2018-02-15 Thread Emir Arnautović
Hi Wael, It is hard to give a recommendation on what to do, since every data set and access pattern differs. There are some guidelines that can be followed, but you will need to test to see which setup suits you. I am guessing that you are running Solr in standalone mode. The problem with such

Re: Reading data from Oracle

2018-02-15 Thread Bernd Fehling
And where is the bottleneck? Is it reading from Oracle or injecting into Solr? Regards Bernd On 15.02.2018 at 08:34, LOPEZ-CORTES Mariano-ext wrote: > Hello > > We have to delete our Solr collection and feed it periodically from an Oracle > database (up to 40M rows). > > We've done the

Solr performance issue

2018-02-15 Thread Srinivas Kashyap
Hi, I have implemented 'SortedMapBackedCache' in my SqlEntityProcessor for the child entities in data-config.xml, and I'm using it for full-import only. At the beginning of my implementation, I had written a delta-import query to index the modified changes. But my requirement grew and