Re: New replica types

2018-01-02 Thread Shalin Shekhar Mangar
Comments inline: On Wed, Jan 3, 2018 at 11:39 AM, S G wrote: > AFAIK, tlog file is truncated with a hard-commit. > So if the TLOG replica is only pulling the tlog-file, it would become out > of date if it does not pull the full index too. > That means that the TLOG

Re: New replica types

2018-01-02 Thread S G
AFAIK, tlog file is truncated with a hard-commit. So if the TLOG replica is only pulling the tlog-file, it would become out of date if it does not pull the full index too. That means that the TLOG replica would do a full copy every time there is a commit on the leader. PULL replica, by definition

Re: New replica types

2018-01-02 Thread Shawn Heisey
On 1/2/2018 8:02 PM, S G wrote: If the above is incorrect, can someone please point that out? Assuming I have a correct understanding of how the different replica types work, I have some small clarifications. If my understanding is incorrect, I hope somebody will point out my errors. TLOG

New replica types

2018-01-02 Thread S G
Hi, I was excited to see some good work in having more replica types for Solr. However, Solr documentation left me with a few questions. https://lucene.apache.org/solr/guide/7_2/shards-and-indexing-data-in-solrcloud.html#types-of-replicas This is what I could come up with: (Note that each

Re: SolrJ with Async Http Client

2018-01-02 Thread Rick Leir
Agrawal There is good reading on the topic at https://wiki.apache.org/solr/IntegratingSolr Cheers -- Rick On January 2, 2018 10:31:28 AM EST, RAUNAK AGRAWAL wrote: >Hi Guys, > >I am trying to write fully async service where solr calls are also >async. >Just wondering

Re: Solr Issue

2018-01-02 Thread Rick Leir
Lewin Is this not a job for a database like MySQL? Solr is a search engine, which can be used as a DB with some effort. Choose the right tool for the job . Cheers -- Rick On January 2, 2018 4:35:47 PM EST, "Lewin Joy (TMNA)" wrote: >** PROTECTED 関係者外秘 >Hi, > >I am using

Re: Solr Issue

2018-01-02 Thread Erick Erickson
wait, what is it you want to do? Streaming is built expressly to handle very large result sets. It is _not_ really designed to deliver pages at a time. Why do you want to use it at all? Why not just use straight Solr, perhaps with cursorMark if you want to page deeply. The query is something
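The cursorMark suggestion above can be sketched roughly as follows. This is a minimal illustration, not code from the thread: `fetch_page` is a hypothetical callable that would send the parameters to a Solr `/select` handler and return the parsed JSON response, and `make_fake_fetch` is an in-memory stand-in so the loop can be run without a server. The sketch relies on two documented cursor rules: the first request uses `cursorMark=*`, and paging is finished when the returned `nextCursorMark` equals the one that was sent. The sort passed in is assumed to include the collection's uniqueKey field, which cursors require.

```python
from typing import Callable, Dict, Iterator, List


def iter_with_cursor(fetch_page: Callable[[Dict[str, str]], dict],
                     query: str, sort: str, rows: int = 100) -> Iterator[dict]:
    """Yield every document matching `query`, fetching one page per request."""
    cursor = "*"  # starting value for the first request
    while True:
        params = {"q": query, "sort": sort, "rows": str(rows), "cursorMark": cursor}
        response = fetch_page(params)  # hypothetical: HTTP call + JSON parsing
        yield from response["response"]["docs"]
        next_cursor = response["nextCursorMark"]
        if next_cursor == cursor:  # unchanged cursor: nothing left to read
            break
        cursor = next_cursor


def make_fake_fetch(docs: List[dict]) -> Callable[[Dict[str, str]], dict]:
    """In-memory stand-in for Solr (illustration only): the cursor is an offset."""
    def fetch(params: Dict[str, str]) -> dict:
        cursor = params["cursorMark"]
        start = 0 if cursor == "*" else int(cursor)
        page = docs[start:start + int(params["rows"])]
        next_cursor = str(start + len(page)) if page else cursor
        return {"response": {"docs": page}, "nextCursorMark": next_cursor}
    return fetch


if __name__ == "__main__":
    sample = [{"id": str(i)} for i in range(5)]
    for doc in iter_with_cursor(make_fake_fetch(sample), "*:*", "id asc", rows=2):
        print(doc["id"])
```

In a real client, `fetch_page` would wrap whatever HTTP library is in use; the loop itself would not change.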

Re: Solrcloud with Master/Slave

2018-01-02 Thread Shawn Heisey
On 1/2/2018 3:32 PM, Sundaram, Dinesh wrote: > I have spun up single solrcloud node on 2 servers. This makes no sense.  If you have two servers, then you probably have more than a single node. > tried to synch up the data b/w those servers via zookeeper This is not done with zookeeper. 

Re: Integrating Opencart 3.0.2.0 with SOLR 7.1

2018-01-02 Thread Shawn Heisey
On 1/2/2018 12:12 PM, David Taylor wrote: > I am trying to integrate an OC website with SOLR 7.1 in standalone mode. I > have managed to connect to and import from MySQL; however, I need help putting > the front end together. > > I have tried following these instructions >

Solrcloud with Master/Slave

2018-01-02 Thread Sundaram, Dinesh
Hi, I have spun up single solrcloud node on 2 servers. tried to synch up the data b/w those servers via zookeeper but didn't work well due to out of memory issues, ensemble issues with multiple ports connectivity. So had to move to Master slave replication b/w those 2 solrcloud nodes. I

Re: Join - Multiple filters

2018-01-02 Thread Shawn Heisey
On 1/2/2018 1:27 PM, Mathieu Larose wrote: > The following query returns p1 (which is expected): > > q={!join fromIndex=child from=p_id_s to=id}y_s:y1 AND z_s:z1 > > > The following query returns nothing (which is not expected): > > q=({!join fromIndex=child from=p_id_s to=id}y_s:y1 AND z_s:z1) I

Solr Issue

2018-01-02 Thread Lewin Joy (TMNA)
** PROTECTED 関係者外秘 Hi, I am using Solr 6.1 and am facing an issue with a complex scenario. Could you help figure out how this can be achieved in Solr? We have items: A, B, C. There will be multiple record entries for each item. For our understanding, let’s say the fields for these records

Re: Using _default configset in standalone mode

2018-01-02 Thread Shawn Heisey
On 1/2/2018 12:55 PM, Alessandro Hoss wrote: > Actually I haven't tried the bin/solr script because I do everything > remotely on Solr. Thanks for the tip, it worked the way I want > (copying the conf to a new folder), but I need to do it through an API > and choosing what configset to copy from.

Re: SOLR 7.2 and LTR

2018-01-02 Thread Dariusz Wojtas
I have created issue SOLR-11809 ( https://issues.apache.org/jira/browse/SOLR-11809) in JIRA and uploaded a minimal working configuration that shows the problem. I hope this will make it easier to verify and find some solution. Best regards, Dariusz Wojtas On Fri, Dec 29, 2017 at 11:35 AM,

Re: SolrJ with Async Http Client

2018-01-02 Thread Gus Heck
It's not very clear (to me) what your use case is, but generally speaking, asynchronous requests can be achieved by using threads/executors/futures (java) or ajax (javascript). The link seems to be a scala project, I'm sure scala has analogous facilities. On Tue, Jan 2, 2018 at 10:31 AM, RAUNAK
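The threads/executors/futures approach mentioned above can be sketched as follows. The message names the Java facilities; this sketch uses the analogous Python ones (`concurrent.futures`) purely for illustration, and `blocking_search` is a hypothetical stand-in for a blocking Solr request, not a real client call.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def blocking_search(query: str) -> Dict[str, object]:
    """Hypothetical stand-in for a blocking search request."""
    time.sleep(0.01)  # simulate network latency
    return {"query": query, "numFound": len(query)}


def search_many(queries: List[str], max_workers: int = 4) -> List[Dict[str, object]]:
    """Run the blocking calls on a thread pool; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(blocking_search, q) for q in queries]
        return [f.result() for f in futures]


if __name__ == "__main__":
    for result in search_many(["solr", "lucene"]):
        print(result)
```

Note that this only moves the blocking call off the caller's thread; it is not the same as a fully non-blocking HTTP client, which is what the original question appears to be asking about.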

Join - Multiple filters

2018-01-02 Thread Mathieu Larose
Hello, Why is q=$JOIN_QUERY parsed differently than q=($JOIN_QUERY) where $JOIN_QUERY is a query with a join? In other words, why does adding enclosing parentheses change the query? Context: There are two cores: parent and child. In the parent core: { "p": "p1", "x_s": "x1" } In the

Re: Scaling issue with Solr

2018-01-02 Thread Shawn Heisey
On 12/27/2017 3:15 PM, Damien Kamerman wrote: > You seem to have the soft and hard commits the wrong way around. Hard > commit is more expensive. That is only the case if the hard commit has openSearcher set to true. It is strongly recommended for ALL users to have openSearcher set to false on

Re: Using _default configset in standalone mode

2018-01-02 Thread Alessandro Hoss
Thanks Shawn, > How are you doing the core create? > You're right, I was using CoreAdmin API. If you use "bin/solr create" Actually I haven't tried the bin/solr script because I do everything remotely on Solr. Thanks for the tip, it worked the way I want (copying the conf to a new folder), but

Integrating Opencart 3.0.2.0 with SOLR 7.1

2018-01-02 Thread David Taylor
Hi all, I am trying to integrate an OC website with SOLR 7.1 in standalone mode. I have managed to connect to and import from MySQL; however, I need help putting the front end together. I have tried following these instructions http://blog.e-zest.com/integrate-apache-solr-with-opencart/ but do

Re: Frequency of Full reindex on SolrCloud

2018-01-02 Thread Shawn Heisey
On 1/2/2018 8:41 AM, bhavin v wrote: > How often do I need to run full reindex on SolrCloud? It takes more than 12 > hours for full reindex to run and we run it every night but is it really > necessary to do it as delta runs correctly. > > New data comes in at the rate of 2000 documents on every

Re: Using _default configset in standalone mode

2018-01-02 Thread Shawn Heisey
On 12/27/2017 7:06 AM, Alessandro Hoss wrote: > After reading this docs, I'm trying to achieve the following with version 7.2.0: > - When creating a new collection, if you *do not specify

Re: Always use leader for searching queries

2018-01-02 Thread Erick Erickson
First of all, replicas can be off in terms of counts for the soft commit interval. The commits don't all happen on the replicas at the same wall-clock time. Solr promises eventual consistency, in this case NOW-autocommit time. So my first question is whether the replicas in the shard are

Re: Always use leader for searching queries

2018-01-02 Thread Novin Novin
Hi Erick, You are right, it is an XY problem. Allow me to explain as best I can. I have two replicas of one collection called "Main". When I was using the search feature in my application, I got two different numFound counts. So I started digging, and after spending 2-3 hours I found the one replica has numFound

Re: delete solr data and index older than 3 days

2018-01-02 Thread Alexandre Rafalovitch
DocExpirationUpdateProcessorFactory may be interesting for you: http://www.solr-start.com/javadoc/solr-lucene/org/apache/solr/update/processor/DocExpirationUpdateProcessorFactory.html Also, if this is a rolling-log with distinct indexing, you could actually do collection aliasing and start the

Re: Frequency of Full reindex on SolrCloud

2018-01-02 Thread Joe Obernberger
Almost never.  I would only run a re-index for newer versions (such as 6.5.2 to 7.2) that have a required feature or schema changes such as changing the type of an existing field (int to string for example).  Not sure what you mean by 'every delta', but I would assume you just mean new data? 

Frequency of Full reindex on SolrCloud

2018-01-02 Thread bhavin v
Hi Guys, How often do I need to run full reindex on SolrCloud? It takes more than 12 hours for full reindex to run and we run it every night but is it really necessary to do it as delta runs correctly. New data comes in at the rate of 2000 documents on every delta per 30 seconds. Total index

Re: Sharding and Replication

2018-01-02 Thread Shawn Heisey
On 12/27/2017 3:02 AM, Gopesh Sharma wrote: > We had two system where we were doing Master Slave Replication, we used to do > delta-import every 24 hours since we did not want the near real-time data. > Now since our data is increasing we thought of adding one more machine to the > master slave

Re: Confusing DocValues documentation

2018-01-02 Thread Shawn Heisey
On 12/22/2017 12:45 PM, Tech Id wrote: > It seems that stored="false" docValues="true" is the default in Solr's > github and the recommended way to go. Like most things in Solr, there's no simple answer.  It depends. For the purposes of information retrieval (not facets, grouping, or sorting),

SolrJ with Async Http Client

2018-01-02 Thread RAUNAK AGRAWAL
Hi Guys, I am trying to write a fully async service where Solr calls are also async. Just wondering whether anyone has tried calling Solr in non-blocking mode, or whether there is a way to do it. I have come across one such project but am wondering whether there is anything provided by

Re: Always use leader for searching queries

2018-01-02 Thread Erick Erickson
This seems like an XY problem. You're asking how to do X because you think it will solve problem Y without telling us what Y is. I say this because on the surface this seems to defeat the purpose behind SolrCloud. Why would you want to only make use of one piece of hardware? That will limit your

Re: How to routing document for send to particular shard range

2018-01-02 Thread Erick Erickson
bq: Only thing which we can achieve is , documents will be routed based on the hash values of the field values. Then you have created your collection with compositeID routing or have some other misconfiguration. You _must_ create your collection with "router.name=implicit". Rather than _tell_ us

Re: With 100% CPU usage giving out of memory exception and solr is not responding

2018-01-02 Thread prathap
What is your Xmx? - 20GB * How many documents in your index? 12GB * What is your filterCache size? 512 MB -- Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html

Re: With 100% CPU usage giving out of memory exception and solr is not responding

2018-01-02 Thread prathap
What is your Xmx? 20GB * How many documents in your index? 12GB * What is your filterCache size? 512 MB

Re: SolrCloud not able to view cloud page - Loading of "/solr/zookeeper?wt=json" failed (HTTP-Status 500)

2018-01-02 Thread prathap
I am also getting the same error. I am not sure where exactly we need to pass -Djute.buffer arguments. In the C:\\zookeeper\bin path I can find zkCli file which has below information. @echo off REM Licensed to the Apache Software Foundation (ASF) under one or more REM contributor license

Re: SolrCloud not able to view cloud page - Loading of "/solr/zookeeper?wt=json" failed (HTTP-Status 500)

2018-01-02 Thread prathap
We are passing -DhostPort=4040;-DzkClientTimeout=2; in the Apache Tomcat service batch file. Will passing the argument below there fix the issue? -Djute.maxbuffer=5291220

Re: How to routing document for send to particular shard range

2018-01-02 Thread Susheel Kumar
Hi Ketan, I believe you need multiple shards, looking at the count of 800M. How much will be the index size? Assume it comes out to 400G and assume your VMs/machines have 64GB and practically you want to fit your index into memory for each shard... With that I would create 10 shards on 10 machines (40 GB
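The sizing arithmetic in the message above can be checked with a few lines. This is only a sketch: the 400 GB index size and 64 GB of RAM per machine are the assumptions stated in the message, and the function name is made up for illustration.

```python
def per_shard_gb(total_index_gb: float, num_shards: int) -> float:
    """Index size each shard holds if documents are spread evenly."""
    return total_index_gb / num_shards


if __name__ == "__main__":
    index_gb = 400   # assumed total index size from the message
    ram_gb = 64      # assumed RAM per machine from the message
    shards = 10      # one shard per machine

    shard_gb = per_shard_gb(index_gb, shards)
    print(f"per shard: {shard_gb} GB, remaining RAM per machine: {ram_gb - shard_gb} GB")
```

With these numbers each shard holds 40 GB, which leaves 24 GB per machine for everything else running there.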

Always use leader for searching queries

2018-01-02 Thread Novin Novin
Hi guys, I am using Solr 5.5.4 and the same version for SolrJ. My question: is there any way I can tell the cloud Solr client to use only the leader for queries? Thanks in advance. Navin

RE: How to routing document for send to particular shard range

2018-01-02 Thread hemanth
Hi Ketan, I also tried various ways to route documents to different shards based on some routing key value. eg: status: active,inactive and terminated should go to 3 different shards. I tried creating implicit as well as composite id routers. I could not route the documents to the shard I want.