Solr Merge Index

2016-07-06 Thread Kalpana
Hello, I have two sources: a Sitecore web index (core 1) and a database table (core 2). I have created core 3, which is a merge of core 1 and core 2: http://localhost:8983/solr/admin/cores?action=mergeindexes&core=core3&srcCore=sitecore_web_index&srcCore=core2 But when someone publishes a page on Sitecore, the
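
A minimal sketch of how a CoreAdmin merge request like the one in this message could be assembled, assuming the target core is passed as `core` and each source core as a repeated `srcCore` parameter. The helper name `merge_indexes_url` is hypothetical; the host and core names are taken from the message.

```python
from urllib.parse import urlencode


def merge_indexes_url(base_url, target_core, source_cores):
    """Build a CoreAdmin MERGEINDEXES URL (parameter names core/srcCore are assumed)."""
    # A list of tuples keeps the repeated srcCore parameters in order.
    params = [("action", "mergeindexes"), ("core", target_core)]
    params += [("srcCore", name) for name in source_cores]
    return f"{base_url}/admin/cores?{urlencode(params)}"


url = merge_indexes_url(
    "http://localhost:8983/solr",
    "core3",
    ["sitecore_web_index", "core2"],
)
print(url)
```

A merge of this kind copies the source indexes as they are at the time of the request, so later changes to a source core would need another merge or direct indexing into the target.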

Re: Shard vs Replica

2016-07-06 Thread Susheel Kumar
To understand shard & replica, let's first understand what sharding is and why it is needed. Sharding - Assume your index grows so large that it doesn't fit on a single machine (e.g. your index size is 80GB and your machine has 64GB, in which case the index won't fit into memory). Now to get

Re: Shard vs Replica

2016-07-06 Thread Anshum Gupta
A collection in SolrCloud is a logical entity that encapsulates documents that conform to a shared schema. As a distributed system, the data needs to be split and so the collection is logically split into 'Shards'. Shard(s): * don't represent a physical index. * are logical entities Replica: *

Shard vs Replica

2016-07-06 Thread John Doe
Hey, I asked the same question on the freenode channel and people answered me, but I believe I still have doubts. Because I have never worked with such data store technologies before, I find it hard to understand what exactly a replica and a shard are in Solr. I believe once I

Re: SolrCloud Node fails that was hosting replicas of a collection

2016-07-06 Thread Erick Erickson
I'd try to figure out what was up with my installation that using the collections API fails rather than continue down the track of using the core admin API. bq: However, since the newly added Solr node is down, it throws Exception "One of the nodes is down" and hence the operation fails. This

Re: CloudSolrServer with multiple zookeeper cluster setup.

2016-07-06 Thread Erick Erickson
CloudSolrServer (and I'm assuming you're in the 4x code line, that's now CloudSolrClient) takes an ensemble, here are two examples: "host1:2181,host2:2181,host3:2181/mysolrchroot" "zoo1.example.com:2181,zoo2.example.com:2181,zoo3.example.com:2181" Best, Erick On Wed, Jul 6, 2016 at 11:48 AM,
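
The two ensemble strings above follow a simple pattern: comma-separated `host:port` pairs, optionally followed by a single chroot path. A small sketch of composing such a string, where the helper name `zk_host_string` and the default port 2181 are assumptions made for illustration:

```python
def zk_host_string(hosts, port=2181, chroot=None):
    """Join ZooKeeper hosts into one ensemble string, with an optional chroot suffix."""
    ensemble = ",".join(f"{host}:{port}" for host in hosts)
    if chroot:
        # The chroot is appended once, after the last host, not per host.
        ensemble += "/" + chroot.strip("/")
    return ensemble


print(zk_host_string(["host1", "host2", "host3"], chroot="mysolrchroot"))
print(zk_host_string(["zoo1.example.com", "zoo2.example.com", "zoo3.example.com"]))
```

The resulting string is what would be handed to the cloud-aware client; listing every ensemble member is what lets the client keep working when one ZooKeeper machine is unavailable.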

RE: Solr more like this

2016-07-06 Thread Jamal, Sarfaraz
Could you index it, do the 'like this' and then delete it from the index? All in one smooth user experience obviously. (Just throwing it out there). Sas -Original Message- From: Charlie Hull [mailto:char...@flax.co.uk] Sent: Wednesday, July 6, 2016 11:02 AM To:

Re: deploy solr on cloud providers

2016-07-06 Thread Tomás Fernández Löbbe
On Wed, Jul 6, 2016 at 2:30 AM, Lorenzo Fundaró < lorenzo.fund...@dawandamail.com> wrote: > On 6 July 2016 at 00:00, Tomás Fernández Löbbe > wrote: > > > The leader will do the replication before responding to the client, so > let's > > say the leader gets to update its

Re: Full re-index without downtime

2016-07-06 Thread Jeff Wartes
A variation on #1 here - Use the same cluster, create a new collection, but use the createNodeSet option to logically partition your cluster so no node has both the old and new collection. If your clients all reference a collection alias, instead of a collection name, then all you need to do
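
A rough sketch of the two Collections API requests this approach relies on: creating the new collection on a restricted set of nodes, then pointing an alias at it. The base URL, collection names, alias name, shard count, and node names are all hypothetical; only the `CREATE`/`CREATEALIAS` actions and the `createNodeSet` option come from the approach described above.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

SOLR = "http://localhost:8983/solr"  # hypothetical base URL


def collections_api_url(action, **params):
    """Build a Collections API URL for the given action and parameters."""
    query = urlencode({"action": action, **params})
    return f"{SOLR}/admin/collections?{query}"


# Create the new collection only on nodes that do not host the old one.
create_url = collections_api_url(
    "CREATE",
    name="products_v2",  # hypothetical new collection
    numShards=2,         # hypothetical shard count
    createNodeSet="node3:8983_solr,node4:8983_solr",  # hypothetical node names
)

# After re-indexing, repoint the alias that clients query.
alias_url = collections_api_url("CREATEALIAS", name="products", collections="products_v2")

print(create_url)
print(alias_url)
```

Because clients only ever reference the alias, switching it to the new collection is the single step that moves traffic, and the old collection can be dropped afterwards.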

CloudSolrServer with multiple zookeeper cluster setup.

2016-07-06 Thread Naveen Pajjuri
Hi, In production we have a SolrCloud setup with a ZooKeeper cluster. I want to switch from HttpSolrServer to CloudSolrServer. Is there any way to specify all the IP addresses of the ZooKeeper machines when instantiating CloudSolrServer, so that I have an automatic fallback mechanism? PS

Re: SolrCloud Node fails that was hosting replicas of a collection

2016-07-06 Thread Deeksha Sharma
Hi Erick, Thanks for your reply, but I did use the ADD REPLICA API at first. However, since the newly added Solr node is down, it throws Exception "One of the nodes is down" and hence the operation fails. I noticed that the node becomes GREEN only after adding the core via Admin UI.

Re: Post filter with boolean query

2016-07-06 Thread Erick Erickson
Not from the response, but a simple approach in a controlled environment, i.e. one where only _you_ are querying the index and there is no indexing going on is to look at the admin UI>>your_core>>plugins/stats>>caches>>filterCache. You'll see the insertions/deletions/hits count. So, after a

Re: Full re-index without downtime

2016-07-06 Thread Steven Bower
There are two options as I see it.. 1. Do something like you describe and create a secondary index, index into it, then switch... I personally would create a completely separate solr cloud alongside my existing one vs new core in the same cloud as you might see some negative impacts on GC caused

Full re-index without downtime

2016-07-06 Thread Steven White
Hi everyone, In my environment, I have use cases where I need to fully re-index my data. This happens because Solr's schema requires changes based on changes made to my data source, the DB. For example, my DB schema may change so that it now has a whole new set of fields added or removed (on

Re: Solr more like this

2016-07-06 Thread Alessandro Benedetti
So, if you have already indexed N PDFs, you can use the MLT request handler to look for documents similar to a specific text. This means you need to manually extract the content from the PDF (using Tika, for example) and then pass it to that request handler.

Post filter with boolean query

2016-07-06 Thread Vasu Y
Hi, I am trying to apply a filter by specifying cost=100, but I would like to use boolean query on lines described below and it seems to be filtering documents; not sure if it's doing pre-filtering or post-filtering. But, I saw some old post that "if the query inside the filter does not have a

Re: help: Solr greek insensitive regex phrase query search

2016-07-06 Thread Valentina Cavazza
Thanks for the answer. On the analysis page I see that Solr ignores tag symbols like <>='# and treats the rest as words (I use StandardTokenizerFactory), so it does not matter if I only have to search in the field: xml:lang='grc-Grek'>βίβλος I can use a query like this: "w ana n βιβλος"~3 but if i

Re: Getting a hit on "the}" but not on "the" or "}"

2016-07-06 Thread Erick Erickson
Yes and No. WDFF does, indeed, break things up. But they're also sequential and you can often get what you want via phrase searches. But what you have now puts junk in your index. What use is "the}" as a single token? It's up to you, but consider cleaning that sort of stuff up with, say a regex

Re: help: Solr greek insensitive regex phrase query search

2016-07-06 Thread Erick Erickson
What do you see if you use the admin/analysis page? That should give you a clue what's happening here Best, Erick On Wed, Jul 6, 2016 at 7:04 AM, Valentina Cavazza wrote: > We created a new field type, this field type is used for a sentence that > contains text in

Re: Solr more like this

2016-07-06 Thread Charlie Hull
On 05/07/2016 19:42, sara hajili wrote: Hi, I indexed PDF files in Solr, and now I want to know whether there is any way to upload a PDF file and have Solr return related PDFs in the result. I mean I don't want to index the PDF file (the file for which I want to get more-like-this PDFs) and just upload pdf

help: Solr greek insensitive regex phrase query search

2016-07-06 Thread Valentina Cavazza
We created a new field type. It is used for a sentence that contains text in Latin and ancient Greek. The text can include Greek words with accents, and we want to be able to do an accent-insensitive search. For example, if I search for the word βιβλος I want to find in the text the
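
To illustrate what accent-insensitive matching means for the example word, here is a small standalone sketch that strips combining marks after Unicode decomposition. It is not Solr's analysis chain, only an illustration of the folding idea; the function name `strip_accents` is hypothetical.

```python
import unicodedata


def strip_accents(text):
    """Remove combining marks (accents, breathings) while keeping base letters."""
    # NFD splits each accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Recompose so the result is in the usual NFC form.
    return unicodedata.normalize("NFC", stripped)


# Both the indexed text and the query would be folded the same way before comparing.
print(strip_accents("βίβλος") == strip_accents("βιβλος"))
```

For a search to behave this way in Solr, the same folding would have to be applied at both index time and query time in the field type's analyzer.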

RE: Access Solr via Apache's mod_proxy_balancer or mod_jk (AJP)

2016-07-06 Thread Davis, Daniel (NIH/NLM) [C]
Again I have to insert the larger company view: * if your company is largish, you may have a load balancer hardware already in use by systems. * If you are using a Cloud system for the Solr, then you can probably use a load balancer provided by the cloud provider, and this may be cheaper

Re: Access Solr via Apache's mod_proxy_balancer or mod_jk (AJP)

2016-07-06 Thread Shawn Heisey
On 7/4/2016 9:53 AM, Shawn Heisey wrote: > On 7/4/2016 3:54 AM, Andreas Kahl wrote: >> Hello everyone, we've setup two Solr servers (not SolrCloud) which >> shall be accessed via Apache webserver's load balancing (either >> mod_proxy_balancer or mod_jk). 1. Is it possible to configure Solr >5 >>

Re: Getting a hit on "the}" but not on "the" or "}"

2016-07-06 Thread Steven White
Thanks Erick. Moving the stopword factory to after WDFF fixed the problem; I no longer get a hit on "the}" or the variations "the]", "the.", etc. I did not have to change preserveOriginal from 1 to 0. Regarding preserveOriginal in WDFF, I have it set to 1 because my understanding of it means

Re: Suggester Issue

2016-07-06 Thread Rajesh Kapur
Hi, I have reinstalled Solr and it is working fine now. A couple of issues I am facing now are: 1. The cfq parameter expects only brand-name as configured, so if I have a special character like - in the brand name I am not able to fetch the suggestion. 2. I want to fetch the suggestion filtered on 2

sorlcloud connection issue

2016-07-06 Thread Kent Mu
Hi friends! *solr version: 4.9.0* I came across a problem when using SolrCloud: it deadlocks. We got the Java core log, and it looks like the HTTP connection pool is exhausted and most threads are waiting to get a free connection. I posted the problem in JIRA, the link is

"Block join faceting is allowed with ToParentBlockJoinQuery only"

2016-07-06 Thread Sebastian Riemer
Hi, Please consider the following three queries: (1)this works: { "responseHeader": { "status": 0, "QTime": 5, "params": { "q": "(type_s:wemi AND {!parent which='type_s:wemi'v='-type_s:wemi AND (((text:(Moby*'})", "facet.field": "m_mainAuthority_s",

Re: Suggester Issue

2016-07-06 Thread Alessandro Benedetti
From a brief look at the config files it seems fine to me, but I didn't have the chance to try it. Have you checked that there is nothing in the Solr logs? You may need to debug. Cheers On Tue, Jul 5, 2016 at 10:10 AM, Rajesh Kapur wrote: > PFA the

Re: deploy solr on cloud providers

2016-07-06 Thread Lorenzo Fundaró
On 6 July 2016 at 00:00, Tomás Fernández Löbbe wrote: > The leader will do the replication before responding to the client, so let's > say the leader gets to update its local copy, but it's terminated before > sending the request to the replicas, the client should get

RE: IO Exception : Truncated chunk for WORKER collection for paraller stream Join Query

2016-07-06 Thread Roshan Kamble
Error is observed on WORKER collection. Below is the error coming at solr instance. ERROR (qtp796684896-894) [c:WORKER s:shard1 r:core_node6 x:WORKER_shard1_replica2] o.a.s.s.HttpSolrCall null:org.apache.http.TruncatedChunkException: Truncated chunk ( expected size: 32768; actual size: 28568)

Antw: RE: Access Solr via Apache's mod_proxy_balancer or mod_jk (AJP)

2016-07-06 Thread Andreas Kahl
Thanks, Shawn and Daniel for your feedback. We will consider that and see what fits best into our environment. Regards Andreas >>> "Davis, Daniel (NIH/NLM) [C]" 05.07.16 19.36 Uhr >>> Because access to Solr is typically to an API, rather than to webapps having images