Re: Multiple unique key in Schema

2015-11-17 Thread Erik Hatcher
Make each document have a composite unique key: user-1, user-2, review-1,... Etc. Easier said than done if you're just posting the CSV directly to Solr but an update script could help. Or perhaps use the UUID auto id feature. Erik > On Nov 17, 2015, at 08:14, Mugeesh Husain
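A minimal sketch of the "update script" idea above, assuming each CSV has an `id` column. The column name, the `type` field and the sample rows are hypothetical, chosen only for illustration. It prefixes every key with its entity type so that users, reviews and restaurants can share one uniqueKey field without colliding.

```python
import csv
import io


def add_composite_ids(csv_text, entity_type, id_column="id"):
    """Return the CSV rows as dicts whose id is prefixed with the entity type."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        row = dict(row)
        # "1" in the user table becomes "user-1", "1" in review becomes "review-1".
        row[id_column] = f"{entity_type}-{row[id_column]}"
        # Hypothetical extra field so documents can be filtered by entity type.
        row["type"] = entity_type
        rows.append(row)
    return rows


users = add_composite_ids("id,name\n1,Ann\n2,Bob\n", "user")
reviews = add_composite_ids("id,text\n1,Great food\n", "review")
```

Because the original key is still recoverable from the composite one, a later update to user 1 can target `user-1` directly.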

Re: Multiple unique key in Schema

2015-11-17 Thread Alexandre Rafalovitch
When you index into Solr, you are overlapping the definitions into one schema. Therefore, you will need a unified uniqueKey. There are a couple of approaches: 1) Maybe you don't actually store the data as three types of entities. Think about what you will want to find and structure the data to

Multiple unique key in Schema

2015-11-17 Thread Mugeesh Husain
Hi! I have three CSV tables: 1) restaurant, 2) user, 3) review. Every CSV has a unique key; how can I configure multiple unique keys in Solr? -- View this message in context: http://lucene.472066.n3.nabble.com/Multiple-unique-key-in-Schema-tp4240550.html Sent from the Solr - User mailing list

Re: Query gives response multiple times

2015-11-17 Thread Shane McCarthy
Yes a complex situation and I am having trouble knowing who to ask for an explanation. I have been given access to the Solr Admin page. Using the persistent identifier PID, I have done a query for PID = *. The PID that are found with that query all have cml_hfenergy_md, a multivalued double

Re: search for documents where all words of field present in the query

2015-11-17 Thread Alexandre Rafalovitch
Are you sure your original description is not a reverse of your use-case? Now, it seems like you just want mm=100 which means "samsung" will match all entries, but "samsung 32G" will only match 3 of them. https://wiki.apache.org/solr/DisMaxQParserPlugin#mm_.28Minimum_.27Should.27_Match.29

Re: Best way to backup and restore an index for a cloud setup in 4.6.1?

2015-11-17 Thread KNitin
You can use solrcloud haft : https://github.com/bloomreach/solrcloud-haft We use it in our production against 4.6.1. Nitin On Monday, May 11, 2015, Shalin Shekhar Mangar wrote: > Hi John, > > There are a few HTTP APIs for replication, one of which can let you take a >

Re: search for documents where all words of field present in the query

2015-11-17 Thread Alexandre Rafalovitch
This sounds more like a use case for https://github.com/flaxsearch/luwak Or a variation of Ted Sullivan's work: http://lucidworks.com/blog/author/tedsullivan/ I do not think this can be done in Solr directly. If your matched fields were always 2-tokens, you could do complex mm param. If the

Limiting number of parallel queries per user

2015-11-17 Thread deansg
Hello, My team is trying to write a SearchComponent that will limit the number of queries a given user can run in parallel at any given moment. We want to do this to prevent any single user from slowing Solr down too much. In the search component, we can identify the user sending the request, and
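A language-neutral sketch of the counting logic such a component would need, written in Python rather than as a Solr SearchComponent. The class and method names are hypothetical and this is not Solr's API. It only tracks in-flight requests per user inside one process, so it says nothing about requests handled by other nodes.

```python
import threading


class PerUserLimiter:
    """Track in-flight requests per user and refuse new ones over a limit."""

    def __init__(self, max_parallel):
        self._max = max_parallel
        self._active = {}
        self._lock = threading.Lock()

    def try_acquire(self, user):
        """Reserve a slot for the user; return False if the limit is reached."""
        with self._lock:
            current = self._active.get(user, 0)
            if current >= self._max:
                return False
            self._active[user] = current + 1
            return True

    def release(self, user):
        """Free a slot previously reserved with try_acquire."""
        with self._lock:
            current = self._active.get(user, 0)
            if current <= 1:
                self._active.pop(user, None)
            else:
                self._active[user] = current - 1
```

A caller would wrap each query in `try_acquire` and `release` (the latter in a `finally` block) and reject or queue the request when `try_acquire` returns False.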

Data Import Handler / Backup indexes

2015-11-17 Thread Brian Narsi
I am using Data Import Handler to retrieve data from a database with full-import, clean = true, commit = true and optimize = true This has always worked correctly without any errors. But just to be on the safe side, I am thinking that we should do a backup before initiating Data Import Handler.

Re: search for documents where all words of field present in the query

2015-11-17 Thread superjim
Thank you so much for the answer! I'm checking the Luwak solution. My business case is very common and simple. 1) A user searches for products. Sample real query: smartphone samsung s3 black 32G 2) I have a really big database of products. I want to return to the user all products from my database: "Samsung s3 32g

Date Math, NOW and filter queries

2015-11-17 Thread Mugeesh Husain
Hi! http://lucidworks.com/blog/2012/02/23/date-math-now-and-filter-queries/ For date range queries I am following the above article. When I query with fq=date:[NOW/DAY-7DAYS TO NOW/DAY], it works fine; when I fire the query fq=date:[NOW/DAY-7DAYS TO NOW/DAY+1DAY], it gives the below

Re: search for documents where all words of field present in the query

2015-11-17 Thread superjim
Here are the same questions I found on Google: Solr query must match all words/tokens in a field http://stackoverflow.com/questions/10508078/solr-query-must-match-all-words-tokens-in-a-field Syntax for query where all words in field must be present in query

Re: Multiple unique key in Schema

2015-11-17 Thread Mugeesh Husain
>> Or perhaps use the UUID auto id feature. If I use UUID, how can I update a particular document? I think that with this approach there will not be any document identity. -- View this message in context: http://lucene.472066.n3.nabble.com/Multiple-unique-key-in-Schema-tp4240550p4240557.html Sent from

Re: Multiple unique key in Schema

2015-11-17 Thread Mugeesh Husain
>> Or perhaps use the UUID auto id feature. If I use UUID, how can I update a particular document? I think that with this approach there will not be any document identity. -- View this message in context: http://lucene.472066.n3.nabble.com/Multiple-unique-key-in-Schema-tp4240550p4240563.html Sent from the

search for documents where all words of field present in the query

2015-11-17 Thread superjim
How would I form a query where all of the words in a field must be present in the query (but possibly more). For example, if I have the following words in a text field: "John Smith" A query for "John" should return no results A query for "Smith" should return no results A query for "John Smith"
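The matching rule being asked for can be stated as a subset test: a document matches when every token of its field appears in the query, and extra query tokens are allowed. The sketch below illustrates only that rule in plain Python, with whitespace tokenization and lowercasing as simplifying assumptions. It is not a Solr query and the function name is hypothetical.

```python
def field_covered_by_query(field_text, query_text):
    """True when every token of the field also occurs in the query."""
    field_tokens = set(field_text.lower().split())
    query_tokens = set(query_text.lower().split())
    # Subset test: the query may contain extra tokens, the field may not.
    return field_tokens <= query_tokens


only_first = field_covered_by_query("John Smith", "John")
only_last = field_covered_by_query("John Smith", "Smith")
exact = field_covered_by_query("John Smith", "John Smith")
with_extra = field_covered_by_query("John Smith", "Mr John Smith Jr")
```

This is the reverse of ordinary search, where every query token must appear in the field, which is why it does not map directly onto a standard query parser.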

Re: DIH Caching w/ BerkleyBackedCache

2015-11-17 Thread Todd Long
Mikhail Khludnev wrote > It's worth to mention that for really complex relations scheme it might be > challenging to organize all of them into parallel ordered streams. This will most likely be the issue for us which is why I would like to have the Berkley cache solution to fall back on, if

Re: CloudSolrClient Connect To Zookeeper with ACL Protected files

2015-11-17 Thread Kevin Lee
Does anyone know if it is possible to set the ACL credentials in CloudSolrClient needed to access a protected resource in Zookeeper? Thanks! > On Nov 13, 2015, at 1:20 PM, Kevin Lee wrote: > > Hi, > > Is there a way to use CloudSolrClient and connect to a Zookeeper

Re: Query gives response multiple times

2015-11-17 Thread Alexandre Rafalovitch
If you have access to the Admin UI, go to the Schema Browser field under the core (I assume Solr 4+ here, never actually asked for your version). https://cwiki.apache.org/confluence/display/solr/Schema+Browser+Screen You can see in the example that when you select a field, it will show whether it

Re: Date Math, NOW and filter queries

2015-11-17 Thread Erick Erickson
Congratulations, you are in "Url escaping hell" ;) the '+' sign is a URL-escape for space, which you see in the error message. Escape it as %2B and you should be fine. Best, Erick On Tue, Nov 17, 2015 at 6:07 AM, Mugeesh Husain wrote: > hi!, > >
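A short sketch of the escaping issue using Python's standard `urllib.parse`. The field name `date` comes from the question; the `q=*:*` parameter is an assumption added for illustration. It shows how an unescaped `+` is decoded as a space, and how percent-encoding the value preserves it.

```python
from urllib.parse import quote, unquote_plus, urlencode

fq = "date:[NOW/DAY-7DAYS TO NOW/DAY+1DAY]"

# What the server sees if the raw '+' is sent in a query string:
# form decoding turns '+' into a space.
decoded_unescaped = unquote_plus(fq)

# Percent-encode the value so '+' travels as %2B.
escaped = quote(fq, safe="")

# urlencode escapes each parameter value when building a whole query string.
query_string = urlencode({"q": "*:*", "fq": fq})

# Decoding the escaped value gives back the original filter query.
round_trip = unquote_plus(escaped)
```

Letting an HTTP client library build the query string, as `urlencode` does here, avoids having to escape each parameter by hand.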

Re: Solr 5.0 cloud, leader's load is higher than others

2015-11-17 Thread Erick Erickson
First of all, why are you using RAMDirectory? This is NOT recommended except for very special circumstances, and it will not increase search speed. Before worrying about the CPU usage on the leader, I'd like to understand why RAMDirectory is in the mix at all. Best, Erick On Tue, Nov 17, 2015

Re: CloudSolrCloud - Commit returns but not all data is visible (occasionally)

2015-11-17 Thread Erick Erickson
That's what was behind my earlier comment about perhaps the call is timing out, thus the commit call is returning _before_ the actual searcher is opened. But the call coming back is not a return from commit, but from Jetty even though the commit hasn't really returned. Just a guess however.

Why can a dynamic field ONLY start or end with '*' but not both?

2015-11-17 Thread Frank Greguska
Hello, Prior to the implementation of SOLR-3251 , it seems it was possible to create dynamic fields using multiple 'glob' characters. e.g. Since this commit

Re: Why can a dynamic field ONLY start or end with '*' but not both?

2015-11-17 Thread Erick Erickson
Starting and ending globs were never officially supported, at least as far back as 3.6. They were never programmatically enforced either apparently. This is from the 3.6 schema.xml: So it was not really discussed that I know about relative to the JIRAs you mentioned, more like you were using

Re: Solr 4.10.4 dropping index on shutdown

2015-11-17 Thread Erick Erickson
Did you commit after indexing and before shutting down? Even if you didn't, I'm still a bit surprised, but that's one possible explanation. But this is the first time I've seen this problem mentioned... Best, Erick On Tue, Nov 17, 2015 at 4:08 AM, Oliver Schrenk wrote: > Hi,

Re: Query gives response multiple times

2015-11-17 Thread Erick Erickson
As far as getting fields back when you specify the "fl" parameter, only _stored_ fields (i.e stored="true" in the schema) are available. As far as your doubled (or more) fields, I'm 99% certain that somehow your input process is doing this (I've seen SQL do "surprising" things for instance). Or

Re: Limiting number of parallel queries per user

2015-11-17 Thread Erick Erickson
This will be hard to do in SolrCloud assuming that the entire cluster is fronted by a load balancer _or_ you're using CloudSolrClient (CloudSolrServer in 4x) because the node that gets the highest-level is distributed against all the Solr nodes. I'd probably go more simply (assuming the above is

Solr 4.10.4 dropping index on shutdown

2015-11-17 Thread Oliver Schrenk
Hi, since we upgraded our cluster from 4.7 to 4.10.4 we are experiencing issues. When shutting down the service (with a confirmed graceful shutdown in the logs), the index is dropped, with only one lonely `segments.gen` file left for each shard and all other files being deleted. There is no

Re: Data Import Handler / Backup indexes

2015-11-17 Thread Brian Narsi
Sorry I forgot to mention that we are using SolrCloud 5.1.0. On Tue, Nov 17, 2015 at 12:09 PM, KNitin wrote: > afaik Data import handler does not offer backups. You can try using the > replication handler to backup data as you wish to any custom end point. > > You can

Re: Date Math, NOW and filter queries

2015-11-17 Thread Shawn Heisey
On 11/17/2015 7:07 AM, Mugeesh Husain wrote: > when i fire query fq=date:[NOW/DAY-7DAYS TO NOW/DAY+1DAY], it is giving > below error > > "fq":"initial_release_date:[NOW/DAY-7DAYS TO NOW/DAY 1DAY]", > "rows":"32"}}, > "error":{ > "msg":"org.apache.solr.search.SyntaxError: Cannot parse

Re: Solr 5.0 cloud, leader's load is higher than others

2015-11-17 Thread Shawn Heisey
On 11/17/2015 1:52 AM, 初十 wrote: > Hello everyone! > > I use solr 5.0 with the RAMDirectory and a collection with 3 replication > and 1 shard, > > the leader‘s load is more higer than others, is it a bug ? Version 5.2 includes a fix that balances the load better so there is not such an imbalance

Re: Query gives response multiple times

2015-11-17 Thread Shane McCarthy
I forgot to let you know that I am using Solr 4.2. On Tue, Nov 17, 2015 at 2:07 PM, Shane McCarthy wrote: > Thank you for the speedy responses. > > I will give the results I have found based on your comments. > > In the schema.xml there are 19 dynamicField are specified. The

Re: Multiple unique key in Schema

2015-11-17 Thread Erik Hatcher
Fair point indeed. Depends on how your update process works though. One can do the trick of assigning batch numbers to an indexing run and deleting documents that aren’t from that reindexing run for example, so it’s not necessary to overwrite documents to “replace” them per se. Erik
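A minimal sketch of the batch-number trick, assuming every document is indexed with a hypothetical `batch_id` field holding the number of the indexing run. After a run completes, one delete-by-query removes everything that was not written by that run. The code only builds the JSON body for an update request and does not contact Solr.

```python
import json


def stale_docs_delete_payload(current_batch, field="batch_id"):
    """Build a JSON delete-by-query body for documents not from this batch."""
    # Match all documents, then exclude those stamped with the current batch.
    query = f"*:* -{field}:{current_batch}"
    return json.dumps({"delete": {"query": query}})


payload = stale_docs_delete_payload(42)
```

With this approach the UUID keys never need to be known in advance, since old documents are removed by batch rather than overwritten by id.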

Re: Solr/jetty and datasource

2015-11-17 Thread fabigol
In the Java properties view, sun.java.command shows start.jar --module=http. I want this line to read sun.java.command start.jar --module=http,jndi. How do I do that? -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-jetty-and-datasource-tp4240426p4240619.html Sent

Re: Query gives response multiple times

2015-11-17 Thread Shane McCarthy
Thank you for the speedy responses. I will give the results I have found based on your comments. In the schema.xml, 19 dynamicField entries are specified. The query and the fields list I want are included in these 19 variables. They are all stored so based on your comment Erick I would assume

Re: Date Math, NOW and filter queries

2015-11-17 Thread Chris Hostetter
: the '+' sign is a URL-escape for space, which you : see in the error message. More specifically, the error indicates that something/somewhere/somehow when you are constructing your request to Solr, your HTTP request params are not getting properly escaped -- so the '+' is being sent literally

Re: Data Import Handler / Backup indexes

2015-11-17 Thread KNitin
afaik Data import handler does not offer backups. You can try using the replication handler to backup data as you wish to any custom end point. You can also try out : https://github.com/bloomreach/solrcloud-haft. This helps backup solr indices across clusters. On Tue, Nov 17, 2015 at 7:08 AM,

Performance testing on SOLR cloud

2015-11-17 Thread Aswath Srinivasan (TMS)
Hi fellow developers, Please share your experience, on how you did performance testing on SOLR? What I'm trying to do is have SOLR cloud on 3 Linux servers with 16 GB RAM and index a total of 2.2 million. Yet to decide how many shards and replicas to have (Any hint on this is welcome too,

Re: Data Import Handler / Backup indexes

2015-11-17 Thread Jeff Wartes
https://github.com/whitepages/solrcloud_manager supports 5.x, and I added some backup/restore functionality similar to SOLR-5750 in the last release. Like SOLR-5750, this backup strategy requires a shared filesystem, but note that unlike SOLR-5750, I haven’t yet added any backup functionality

EdgeNGramFilterFactory not working? Solr 5.3.1

2015-11-17 Thread Daniel Valdivia
Hi, I'm trying to get the EdgeNGramFilterFactory filter to work on a certain field, however after defining the fieldType, creating a field for it and copying the source, this doesn't seem to be working. One catch here, that I'm not sure if it's affecting the outcome is that none of my fields

Re: EdgeNGramFilterFactory not working? Solr 5.3.1

2015-11-17 Thread Alexandre Rafalovitch
Here would be my debugging sequence: 1. Are you actually searching against: dispNamePrefix (and not against the default text field which has its own analyzer stack)? 2. Do you see the field definition in the Schema Browser screen? 3. If you on that screen, click "Load Term Info" do you see the

Re: EdgeNGramFilterFactory not working? Solr 5.3.1

2015-11-17 Thread Daniel Valdivia
Hi Markus, I did; every time I run this experiment I start from 0 :) However, after the last change I made, it seems I forgot to commit and couldn't get results, so now I have some results. The resolution to this problem was specifying the search in the dispNamePrefix field :O Thanks Markus

RE: Performance testing on SOLR cloud

2015-11-17 Thread Markus Jelsma
Hi - we use the Siege load testing program. It can take a seed list of URLs, taken from actual user input, and can put load in parallel. It won't reuse common queries unless you prepare your seed list appropriately. If your setup achieves the goal your client anticipates, then you are fine.

RE: EdgeNGramFilterFactory not working? Solr 5.3.1

2015-11-17 Thread Markus Jelsma
Hi - the usual suspect is: 'did you reindex?' Not seeing things change after modifying index-time analysis chains means you need to reindex. M. -Original message- > From:Daniel Valdivia > Sent: Wednesday 18th November 2015 0:17 > To:

Re: Date Math, NOW and filter queries

2015-11-17 Thread Mugeesh Husain
Thanks, all of you. The problem was indeed that the '+' sign is a URL-escape for space; using %2B instead of the + sign should be fine. -- View this message in context: http://lucene.472066.n3.nabble.com/Date-Math-NOW-and-filter-queries-tp4240561p4240675.html Sent from the Solr - User mailing list

RE: Expand Component Fields Response

2015-11-17 Thread Sanders, Marshall (AT - Atlanta)
Well I didn't receive any responses and couldn't find any resources so I created a patch and a corresponding JIRA to allow the ExpandComponent to use the TotalHitCountCollector which will only return the total hit count when expand.rows=0 which more accurately reflected my use case. (We don't

Re: Query gives response multiple times

2015-11-17 Thread Shane McCarthy
Thank you for the resources and all the help. I hope that I clear this up soon. I will ask the Islandora folks for their thoughts. Cheers, Shane On Tue, Nov 17, 2015 at 3:19 PM, Alexandre Rafalovitch wrote: > Add echoParams=all to see what are default and other

Re: Performance testing on SOLR cloud

2015-11-17 Thread Erick Erickson
I wouldn't bother to shard either. YMMV of course, but 2.2M documents is actually a pretty small number unless the docs themselves are huge. Sharding introduces inevitable overhead, so it's usually the last thing you resort to. As far as the number of replicas is concerned, that's strictly a

Re: Split Shards

2015-11-17 Thread Erick Erickson
num_shards is indeed the number of shards created when you create the collection. num_shards is irrelevant to the splitshard command. You can look in your state.json (collectionstate.json in Solr 4x) to find this number. Best, Erick On Tue, Nov 17, 2015 at 2:25 PM, kiyer_adobe

Re: Performance testing on SOLR cloud

2015-11-17 Thread Keith L
To add to Erick's point: It's also highly dependent on the types of queries you expect (sorting, faceting, fq, q, size of documents) and how many concurrent updates you expect. If most queries are going to be similar and you are not going to be updating very often, you can expect most of your

Re: StringIndexOutOfBoundsException using spellcheck and synonyms

2015-11-17 Thread Derek Poh
Hi, Any advice on how to resolve or work around this issue? On 11/17/2015 8:28 AM, Derek Poh wrote: Hi Scott I am using Solr 4.10.4. On 11/16/2015 10:06 PM, Scott Stults wrote: Hi Derek, Could you please add what version of Solr you see this in? I didn't see a related Jira, so this might

Re: Re: how to join search multiple collections in SolrCloud

2015-11-17 Thread Paul Blanchaert
When you want the results of 'b' in the results of the join, you'll have to reconsider and merge 'b' into 'a' as suggested by Erick... This because the results of the join are not a combination of the 2 collections (as with " select a*,b.* " ). In the 'search' world, you can look at a join as a fq

Re: Undo Split Shard

2015-11-17 Thread Emir Arnautovic
Hi, You can try manually adjusting cluster state in ZK to include parent shard and exclude splits, reload collection and try split again. Btw. any error in logs when split failed? Thanks, Emir On 17.11.2015 07:08, kiyer_adobe wrote: We had 32 shards of 30GB each. The query performance was

Re: Undo Split Shard

2015-11-17 Thread kiyer_adobe
Thanks Emir. How do I update the cluster state in zk? Is there an API for it? -- View this message in context: http://lucene.472066.n3.nabble.com/Undo-Split-Shard-tp4240508p4240523.html Sent from the Solr - User mailing list archive at Nabble.com.

Solr 5.0 cloud, leader's load is higher than others

2015-11-17 Thread 初十
Hello everyone! I use Solr 5.0 with the RAMDirectory and a collection with 3 replicas and 1 shard. The leader's load is much higher than the others'. Is it a bug?

Re: CloudSolrCloud - Commit returns but not all data is visible (occasionally)

2015-11-17 Thread adfel70
Thanks Erick, I'll try to play with the autowarm config. But I have a more direct question - why does the commit return without waiting till the searchers are fully refreshed? Could it be that the parameter waitSearcher=true doesn't really work? Or maybe I don't understand something here...

Re: Solr 5.0 cloud, leader's load is higher than others

2015-11-17 Thread 初十
Has anyone encountered the same problem, and how did you solve it? 2015-11-17 16:52 GMT+08:00 初十 : > > Hello everyone! > > I use solr 5.0 with the RAMDirectory and a collection with 3 replication > and 1 shard, > > the leader‘s load is more higer than others, is it a bug ? >

Re: Undo Split Shard

2015-11-17 Thread Jan Høydahl
Stop Solr. Then use zkcli - https://cwiki.apache.org/confluence/display/solr/Command+Line+Utilities You want to first do getfile for state.json, then modify it, then putfile to upload it again. Start Solr -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com > 17. nov.

Re: Re: how to join search multiple collections in SolrCloud

2015-11-17 Thread soledede_w...@ehsy.com
Yes, thanks. But it only supports (select * from A where A.id in(select id from B where...)), doesn't it? I was hoping for (select a*,b.* from A a join B b on A.id = B.id). How do I merge the results of the shards? Thanks soledede_w...@ehsy.com From: Paul Blanchaert Date: 2015-11-17 15:57 To: solr-user Subject:

Re: Solr Search: Access Control / Role based security

2015-11-17 Thread Noble Paul
I haven't evaluated ManifoldCF for this. However, my preference would be to have a generic mechanism built into Solr to restrict user access to certain docs based on some field values. Relying on external tools makes life complex for users who do not like it. Our strategy is * Provide a

Re: Security Problems

2015-11-17 Thread Noble Paul
The authentication plugin is not expensive if you are talking in the context of the admin UI. After all, it is not handling hundreds of requests per second. The simplest solution would be to provide a well-known permission name called "admin-ui" and ensure that every admin page load makes a call to some

Re: Query gives response multiple times

2015-11-17 Thread Alexandre Rafalovitch
Add echoParams=all to see what are default and other parameters that apply to your request https://wiki.apache.org/solr/CoreQueryParameters#echoParams . Specifically, what the 'fl' setting is. Or try setting 'fl' explicitly and see if the display changes. copyField, The one that I would expect to

Re: Expand Component Fields Response

2015-11-17 Thread Joel Bernstein
Hi Marshall, This sounds pretty reasonable. I should have some time to review the patch later in the week. Joel Bernstein http://joelsolr.blogspot.com/ On Tue, Nov 17, 2015 at 3:42 PM, Sanders, Marshall (AT - Atlanta) < marshall.sand...@autotrader.com> wrote: > Well I didn't receive any responses

Re: Undo Split Shard

2015-11-17 Thread kiyer_adobe
Thanks Jan. -- View this message in context: http://lucene.472066.n3.nabble.com/Undo-Split-Shard-tp4240508p4240698.html Sent from the Solr - User mailing list archive at Nabble.com.

Split Shards

2015-11-17 Thread kiyer_adobe
Hi, I understand that you provision the number of shards needed when you create the collection using the num_shards parameter. A few questions: - Is this only for the initial number of shards, or would it apply when you split the original shard as well? - What happens when the splits go over the number of shards