Re: How can I set the defaultOperator to be AND?

2016-04-26 Thread Bastien Latard - MDPI AG
Thank you Erick. You're fully right that it can be an expected behavior to get more docs with more words...why not... However, when I set the default OP to "AND" in solrconfig.xml, then a simple query "q=a OR b" doesn't work as expected... as described in the previous email: -> a search

Build Java Package for required schema and solrconfig files field and configuration.

2016-04-26 Thread Nitin Solanki
Hello Everyone, I have created an autosuggest using the Solr suggester. I have added a field and field type in schema.xml and made some changes to the /suggest request handler in solrconfig.xml. Now, I need to build a Java package using that configuration, which I need to plug

Re: concat 2 fields

2016-04-26 Thread vrajesh
Hi Jack, as per your explanation i made following changes: id title title title title

Re: Degraded performance between Solr 4 and Solr 5

2016-04-26 Thread Erick Erickson
Well, the first question is always "how are you measuring this"? Measuring a few queries is almost completely uninformative, especially if the two systems have differing warmups. The only meaningful measurements are when throwing away the first bunch of queries then measuring a meaningful sample.
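A minimal sketch of the measurement approach described above: time every query but throw away the first batch of timings before summarizing. The names `measure_queries`, `run_query`, and `warmup` are hypothetical, and `run_query` stands in for whatever client call actually sends the query to Solr.

```python
import statistics
import time


def measure_queries(run_query, queries, warmup=100):
    """Time each query, ignoring the first `warmup` timings.

    `run_query` is a hypothetical callable that sends one query to the
    system under test; it is not part of any Solr client library.
    """
    timings = []
    for i, query in enumerate(queries):
        start = time.perf_counter()
        run_query(query)
        elapsed = time.perf_counter() - start
        # The first `warmup` queries only serve to warm caches.
        if i >= warmup:
            timings.append(elapsed)
    if not timings:
        raise ValueError("need more queries than the warmup count")
    return {
        "count": len(timings),
        "mean": statistics.mean(timings),
        "median": statistics.median(timings),
        "max": max(timings),
    }
```

Running the same query list with the same `warmup` against both clusters gives numbers that are at least comparable; how large the sample needs to be depends on the workload.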

Re: How can I set the defaultOperator to be AND?

2016-04-26 Thread Erick Erickson
Defaulting to "OR" has been the behavior since forever, so changing the behavior now is just not going to happen. Making it fit a new version of "correct" will change the behavior for every application out there that has not specified the default behavior. There's no a-priori reason to expect

Re: Replicas for same shard not in sync

2016-04-26 Thread Erick Erickson
You left out step 5... leader responds with fail for the update to the client. At this point, the client is in charge of retrying the docs. Retrying will update all the docs that were successfully indexed in the failed packet, but that's not unusual. There's no real rollback semantics that I know
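A rough sketch of the client-side retry described above, assuming the documents carry a unique key so that resending docs that were already indexed simply overwrites them. `send_with_retry` and `send_batch` are hypothetical names; `send_batch` stands in for the real update call and is expected to raise on failure.

```python
import time


def send_with_retry(send_batch, batch, attempts=3, delay=0.0):
    """Resend the whole batch until it succeeds or attempts run out.

    Returns the number of attempts used. `send_batch` is a hypothetical
    callable that raises an exception when the update fails.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            send_batch(batch)
            return attempt
        except Exception as exc:
            # There is no rollback: the whole packet is sent again.
            last_error = exc
            time.sleep(delay)
    raise last_error
```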

Re: 'batching when indexing is good' -> some questions

2016-04-26 Thread Erick Erickson
These are orthogonal. Confusing I know... In my blog, "batch size" refers to the number of documents sent to _Solr_ when you're indexing, in this case from a SolrJ program but the results generally hold for HTTP requests. The "batch size" you're seeing in DIH is the batch size for getting
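To illustrate the first meaning of "batch size" (documents per request sent to Solr), here is a small sketch in Python rather than SolrJ. The `batches` and `index_all` helpers and the `send_batch` callable are hypothetical; `send_batch` stands in for whatever client call posts a list of documents in one request.

```python
from itertools import islice


def batches(docs, batch_size=1000):
    """Yield lists of at most `batch_size` documents from any iterable."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(docs)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def index_all(docs, send_batch, batch_size=1000):
    """Send documents in batches; return how many documents were sent."""
    sent = 0
    for batch in batches(docs, batch_size):
        send_batch(batch)  # one request per batch, not one per document
        sent += len(batch)
    return sent
```

This is separate from the DIH `batchSize` setting mentioned in the message, which the source describes as a different thing.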

Re: Questions on SolrCloud core state, when will Solr recover a "DOWN" core to "ACTIVE" core.

2016-04-26 Thread Erick Erickson
One of the reasons this happens is if you have very long GC cycles, longer than the Zookeeper "keep alive" timeout. During a full GC pause, Solr is unresponsive and if the ZK ping times out, ZK assumes the machine is gone and you get into this recovery state. So I'd collect GC logs and see if you

Re: Tuning solr for large index with rapid writes

2016-04-26 Thread Erick Erickson
If I'm reading this right, you have 420M docs on a single shard? If that's true you are pushing the envelope of what I've seen work and be performant. Your OOM errors are the proverbial 'smoking gun' that you're putting too many docs on too few nodes. You say that the document count is "growing

Re: Child doc facet not getting terms, only counts

2016-04-26 Thread Yangrui Guo
I've finally solved this problem. It appears that I do not need to add the line domain: blockChildren: content_type:c in the subfacet. Now I've got my desired results On Tue, Apr 26, 2016 at 3:14 PM, Yangrui Guo wrote: > The documents are organized in a key-value like

Tuning solr for large index with rapid writes

2016-04-26 Thread Stephen Lewis
Hello, I'm looking for some guidance on the best steps for tuning a solr cloud cluster which is heavy on writes. We are currently running a solr cloud fleet composed of one core, one shard, and three nodes. The cloud is hosted in AWS, and each solr node is on its own linux r3.2xl instance with 8

Re: Child doc facet not getting terms, only counts

2016-04-26 Thread Yangrui Guo
The documents are organized in a key-value like structure { id: 1 product_name: some apparel category: apparel { attribute: brand value: Chanel } { attribute: madein value: Europe } } Because there are indefinite

Re: Child doc facet not getting terms, only counts

2016-04-26 Thread Yonik Seeley
How are the documents indexed? Can you show an example document (with nested documents)? -Yonik On Tue, Apr 26, 2016 at 5:08 PM, Yangrui Guo wrote: > When I use subfaceting with Json API, the facet results only gave me > counts, no terms. My query is like this: > > { >

Re: Questions on SolrCloud core state, when will Solr recover a "DOWN" core to "ACTIVE" core.

2016-04-26 Thread Li Ding
Thank you all for your help! The zookeeper log rolled over, this is from Solr.log: Looks like the solr and zk connection is gone for some reason INFO - 2016-04-21 12:37:57.536; org.apache.solr.common.cloud.ConnectionManager; Watcher org.apache.solr.common.cloud.ConnectionManager@19789a96

Child doc facet not getting terms, only counts

2016-04-26 Thread Yangrui Guo
When I use subfaceting with Json API, the facet results only gave me counts, no terms. My query is like this: { apparels : { type: terms, field: brand, facet:{ values:{ type: query, q:\"brand:Chanel\", facet: {

Re: Replicas for same shard not in sync

2016-04-26 Thread Jeff Wartes
At the risk of thread hijacking, this is an area I'm not sure I fully understand, so I want to make sure. I understand the case where a node is marked “down” in the clusterstate, but what if it’s down for less than the ZK heartbeat? That’s not unreasonable, I’ve seen some

Re: The Streaming API (Solrj.io) : id must have DocValues?

2016-04-26 Thread Joel Bernstein
My blog is pretty out of date at this point unfortunately. I need to get some better examples published. Also, a huge amount of work went into the Solr 6 Streaming API and Streaming Expressions that makes them much easier to work with. In Solr 6.1 you'll be able to test Streaming

RE: Overall large size in Solr across collections

2016-04-26 Thread Allison, Timothy B.
> I can tell you that Tika is quite the resource hog. It is likely chewing up > CPU and memory > resources at an incredible rate, slowing down your Solr server. You > would probably see better performance than ERH if you incorporate Tika > and SolrJ into a client indexing program that runs

Re: The Streaming API (Solrj.io) : id must have DocValues?

2016-04-26 Thread sudsport s
I see that some work was done to remove the stream handler from config. So is enabling the stream handler still a security issue? https://issues.apache.org/jira/browse/SOLR-8262 On Tue, Apr 26, 2016 at 11:14 AM, sudsport s wrote: > I am using solr 5.3.1 server & solr5.5 on client (

Re: The Streaming API (Solrj.io) : id must have DocValues?

2016-04-26 Thread sudsport s
I am using solr 5.3.1 server & solr5.5 on client ( solrj) . I will try with solrj 6.0 On Tue, Apr 26, 2016 at 11:12 AM, Susmit Shukla wrote: > Which solrj version are you using? could you try with solrj 6.0 > > On Tue, Apr 26, 2016 at 10:36 AM, sudsport s

Re: The Streaming API (Solrj.io) : id must have DocValues?

2016-04-26 Thread Susmit Shukla
Which solrj version are you using? Could you try with solrj 6.0? On Tue, Apr 26, 2016 at 10:36 AM, sudsport s wrote: > @Joel > >Can you describe how you're planning on using Streaming? > > I am mostly using it for distributed join case. We were planning to use > similar

Re: The Streaming API (Solrj.io) : id must have DocValues?

2016-04-26 Thread sudsport s
@Joel >Can you describe how you're planning on using Streaming? I am mostly using it for the distributed join case. We were planning to use similar logic (hash id and join) in Spark for our use case, but since the data is stored in Solr, I will be using Solr streams to perform the same operation. I have

Degraded performance between Solr 4 and Solr 5

2016-04-26 Thread Jaroslaw Rozanski
Hi all, I am migrating a large Solr Cloud cluster from Solr 4.10 to Solr 5.5.0 and I observed a big difference in query execution time. First a setup summary: - multiple collections - 6 - each has multiple shards - 6 - same/similar hardware - indexing tens of messages per second - autoSoftCommit

Re: The Streaming API (Solrj.io) : id must have DocValues?

2016-04-26 Thread sudsport s
Thanks @Reth, yes, that was one of my concerns. I will look at the JIRA you mentioned. Thanks Joel, I used some of the examples for the streaming client from your blog. I got a basic tuple stream working but I get the following exception while running parallel string. java.io.IOException:

MoreLikeThis Component - how to get fields of documents

2016-04-26 Thread Dr. Jan Frederik Maas
Hello, I want to use the moreLikeThis Component to get similar documents from a sharded SOLR. This works quite well except for the fact that the documents in the moreLikeThis-list only contain the id/unique key of the documents. Is it possible to get the other fields? I can of course do

Re: concat 2 fields

2016-04-26 Thread Jack Krupansky
As I myself had commented on that grokbase thread so many months ago, there are examples of how to do this in my old Solr 4.x Deep Dive book. If you read the grokbase thread carefully, you will see that you left out the prefix "Custom" in front of "Concat" - this is not a standard Solr feature.

Re: concat 2 fields

2016-04-26 Thread vrajesh
i have tried two methods to define as follow: 1) id id_title title id_title

Re: Amazon CloudSearch

2016-04-26 Thread Sameer Maggon
Hi Sergio, CloudSearch is a Search-as-a-Service that uses SOLR underneath, though they have a proprietary API to interact with it, both on the document side and the query side. It won't give us the ability to 'manage' Solr instances or a cluster. If you have a use case where you want to keep on pumping

'batching when indexing is good' -> some questions

2016-04-26 Thread Bastien Latard - MDPI AG
Hi Erick (Erickson) & others, I read your post 'batching when indexing is good'. But I also read this one, which recommends using batchSize="-1". So I now have some

ANN: Solr puzzle: Magic Date

2016-04-26 Thread Alexandre Rafalovitch
I am doing an experiment in teaching about Solr. I've created a Solr puzzle and want to know whether people would find it useful to do more of these. My mailing list has seen this already, but I would love the feedback from a wider Solr audience as well. Privately or on the list. The - first -

Amazon CloudSearch

2016-04-26 Thread marotosg
Hi, I am evaluating the possibility of using Amazon CloudSearch to manage Solr instances. The reason is the price and the time needed to manage and deploy. I am not fully sure yet how flexible that service is, in case you need to install a specific Solr version or plugin. Do you have any experience with it?

Re: Solr Cloud Indexing Performance degrades suddenly

2016-04-26 Thread Reth RM
What are the recent changes made to database or DIH? Version upgrade? Addition of new fields? co-location of db? On Tue, Apr 26, 2016 at 2:47 PM, preeti kumari wrote: > I am using solr 5.2.1 . > > > -- Forwarded message -- > From: preeti kumari

Re: concat 2 fields

2016-04-26 Thread Reth RM
Check if you have added the 'concatFields' definition as well in solrconfig.xml... How are you indexing btw? On Tue, Apr 26, 2016 at 12:24 PM, vrajesh wrote: > Hi, > i have added it to /update request handler as per following in > solrconfig.xml: > > >

HttpSolrClient issue

2016-04-26 Thread srinivasarao vundavalli
I am using the HttpSolrClient class to query and fetch documents from a Solr index. I am passing my custom httpclient object to HttpSolrClient. HttpSolrClient solrClient = new HttpSolrClient(url, httpClient); This prevents me from setting the maximum number of connections using *setMaxTotalConnections

Fwd: Solr Cloud Indexing Performance degrades suddenly

2016-04-26 Thread preeti kumari
I am using solr 5.2.1 . -- Forwarded message -- From: preeti kumari Date: Mon, Apr 25, 2016 at 2:29 PM Subject: Solr Cloud Indexing Performance degrades suddenly To: solr-user@lucene.apache.org Hi, I have 2 solr cloud setups : Primary and secondary.

Indexing performance on HDFS

2016-04-26 Thread KORTMANN Stefan (MORPHO)
Hi, can indexing on HDFS somehow be tuned up using pluggable codecs / some customized PostingsFormat? What settings would you recommend for using Lucene 5.5 on HDFS? Regards, Stefan

RunTimeLib Transformers

2016-04-26 Thread Basel Ariqat
Hi, I want to make my transformers in solr loaded at run time (use .system collection to upload jars), but this feature seems to only work with requesthandlers, responsewriters and other plugins in solrconfig.xml, it doesn't work with anything in data-config.xml probably because it's dependent on

Re: concat 2 fields

2016-04-26 Thread vrajesh
Hi, i have added it to /update request handler as per following in solrconfig.xml: application/json concatFields application/csv concatFields but when i query it after indexing new files, i dont see

Re: How can I set the defaultOperator to be AND?

2016-04-26 Thread Bastien Latard - MDPI AG
Thank you Shawn, Jan and Georg for your answers. Yes, it seems that if I simply remove the defaultOperator it works well for "composed queries" like '(a:x AND b:y) OR c:z'. But I think that the default Operator should/could be the AND. Because when I add an extra search word, I expect that
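One way to get AND semantics for a single request without changing the configured default is Solr's `q.op` request parameter. A minimal sketch of building such a request URL follows; the `select_url` helper, the base URL, and the core name `mycore` are hypothetical and only there for illustration.

```python
from urllib.parse import urlencode


def select_url(base_url, query, default_op="AND"):
    """Build a /select URL that sets the default operator per request."""
    if default_op not in ("AND", "OR"):
        raise ValueError("default_op must be 'AND' or 'OR'")
    params = {"q": query, "q.op": default_op, "wt": "json"}
    return base_url.rstrip("/") + "/select?" + urlencode(params)


# Hypothetical host and core name.
print(select_url("http://localhost:8983/solr/mycore", "a b"))
# prints http://localhost:8983/solr/mycore/select?q=a+b&q.op=AND&wt=json
```

Whether this interacts well with explicit OR clauses in a composed query is exactly the question raised in this thread, so it is worth testing against the queries in question.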