Re: solr spell correction help

2013-04-15 Thread Rohan Thakur
OK, thanks Jack, but then why is cattle not giving kettle as a suggestion? On Fri, Apr 12, 2013 at 6:46 PM, Jack Krupansky j...@basetechnology.com wrote: blandars its not giving correction as blender They have an edit distance of 3. Direct Spell is limited to a maximum ED of 2. -- Jack
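
The edit-distance figures in this thread can be checked with a small sketch. This is a plain textbook Levenshtein implementation in Python, not Solr's own code; the function name `levenshtein` is made up for illustration, and Solr's direct spellchecker may apply further settings (for example a required common prefix) that this sketch does not model.

```python
def levenshtein(a: str, b: str) -> int:
    """Textbook Levenshtein distance: insertions, deletions, substitutions."""
    # previous[j] is the distance between the already processed prefix of a and b[:j]
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute, or keep on a match
            ))
        previous = current
    return previous[-1]


print(levenshtein("blandars", "blender"))  # 3
print(levenshtein("cattle", "kettle"))     # 2
```

Under plain Levenshtein, cattle/kettle is within a distance of 2, so the maximum edit distance alone would not rule that pair out; what else is filtering it is not confirmed anywhere in the thread.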

Re: solr spell correction help

2013-04-15 Thread Rohan Thakur
But Jack, I'm not using the Levenshtein distance measure, I'm using Jaro-Winkler distance. On Mon, Apr 15, 2013 at 11:50 AM, Rohan Thakur rohan.i...@gmail.com wrote: OK, thanks Jack, but then why is cattle not giving kettle as a suggestion? On Fri, Apr 12, 2013 at 6:46 PM, Jack Krupansky

Re: Does solr cloud support rename or swap function for collection?

2013-04-15 Thread Tim Vaillancourt
I added a brief description on CREATEALIAS here, feel free to tweak: http://wiki.apache.org/solr/SolrCloud#Managing_collections_via_the_Collections_API Tim On 07/04/13 05:29 PM, Mark Miller wrote: It's pretty simple - just as Brad said, it's just

how to update document with DIH (FileDataSource)

2013-04-15 Thread Jeong-dae Ha
Hi all, I am trying to index from both a DB and files; information from the DB and a file together makes up one document. So I decided to update the documents that I have already indexed from the DB. I will use DIH because there are millions of files, if I can find out how to update a document with DIH. I need your help. Thanks in advance.

Re: Solr using a ridiculous amount of memory

2013-04-15 Thread Toke Eskildsen
On Sun, 2013-03-24 at 09:19 +0100, John Nielsen wrote: Our memory requirements are running amok. We have less than a quarter of our customers running now and even though we have allocated 25GB to the JVM already, we are still seeing daily OOM crashes. Out of curiosity: Did you manage to

Re: Solr using a ridiculous amount of memory

2013-04-15 Thread John Nielsen
Yes and no, The FieldCache is the big culprit. We do a huge amount of faceting so it seems right. Unfortunately I am super swamped at work so I have precious little time to work on this, which is what explains my silence. Out of desperation, I added another 32G of memory to each server and

Re: SolrCloud vs Solr master-slave replication

2013-04-15 Thread Victor Ruiz
Hi Shawn, thank you for your reply. I'll check if network card drivers are ok. About the RAM, the JVM max heap size is currently 6GB, but it never reaches the maximum, typically the used RAM is not more than 5GB. Should I assign more RAM? I've read that excess of RAM assigned could have also a

Re: how to migrate solr 1.4 index to solr 4.2 index

2013-04-15 Thread Montu v Boda
Hi, right now we have just moved the 1.4 indexes to 4.2.1 and are running the tests on that. Thanks, Regards, Montu v Boda -- View this message in context: http://lucene.472066.n3.nabble.com/how-to-migrate-solr-1-4-index-to-solr-4-2-index-tp4055531p4055997.html Sent from the Solr - User mailing list

Re: Which tokenizer or analizer should use and field type

2013-04-15 Thread Erick Erickson
try executing these with debug=all and examine the resulting parsed query, that'll show you exactly how the query is parsed. Also, the query language is not strictly boolean, see: http://searchhub.org/2011/12/28/why-not-and-or-and-not/ The first thing I would try would be to parenthesize

Re: Solr Indexing My SQL Timestamp or Date Time field

2013-04-15 Thread Erick Erickson
Solr requires precise date formats, see: http://lucene.apache.org/solr/api-4_0_0-BETA/org/apache/solr/schema/DateField.html Best Erick On Sun, Apr 14, 2013 at 11:43 AM, ursswak...@gmail.com ursswak...@gmail.com wrote: Hi, To index Date in Solr, Date should be in ISO format. Can we index
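
As a rough illustration of the date format issue in this thread: Solr date fields expect ISO 8601 in UTC with a trailing Z (for example 1995-12-31T23:59:59Z). The sketch below, in Python, converts a MySQL-style `YYYY-MM-DD HH:MM:SS` string to that form. The helper name `mysql_to_solr_date` is made up, and it assumes the stored value is already in UTC, which the thread does not state.

```python
from datetime import datetime, timezone


def mysql_to_solr_date(value: str) -> str:
    """Convert 'YYYY-MM-DD HH:MM:SS' (assumed to be UTC) to Solr's date format."""
    parsed = datetime.strptime(value, "%Y-%m-%d %H:%M:%S")
    # Assumption: the timestamp is already UTC, so we only attach the zone.
    parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.strftime("%Y-%m-%dT%H:%M:%SZ")


print(mysql_to_solr_date("2013-04-14 11:43:00"))  # 2013-04-14T11:43:00Z
```

If the database stores local times, the value would need converting to UTC first rather than just relabelling it.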

Test harness can not load existing index data in Solr 4.2

2013-04-15 Thread zhu kane
I'm extending Solr's *AbstractSolrTestCase* for unit testing. I have existing 'schema.xml', 'solrconfig.xml' and index data. I want to start an embedded solr server to load existing collection and its data. Then test searching doc in solr. This way works well in Solr 3.6. However it does not

Re: Solr using a ridiculous amount of memory

2013-04-15 Thread Toke Eskildsen
On Mon, 2013-04-15 at 10:25 +0200, John Nielsen wrote: The FieldCache is the big culprit. We do a huge amount of faceting so it seems right. Yes, you wrote that earlier. The mystery is that the math does not check out with the description you have given us. Unfortunately I am super swamped

Re: Some Questions About Using Solr as Cloud

2013-04-15 Thread Furkan KAMACI
Hi Jack; I see that SolrCloud makes everything automated. When I use SolrCloud is it true that: there may be more than one computer responsible for indexing at any time? 2013/4/15 Jack Krupansky j...@basetechnology.com There are no masters or slaves in SolrCloud - it's fully distributed. Some

Re: SolR InvalidTokenOffsetsException with Highlighter and Synonyms

2013-04-15 Thread Dmitry Kan
Hi, Does it work well, if you remove synonyms with spaces in them, like eighty six ? Dmitry On Fri, Apr 5, 2013 at 3:43 AM, juancesarvillalba juancesarvilla...@gmail.com wrote: Hi I saw some similar problems in other threads but I think that this is a little different and couldn't get any

Re: How do I recover the position and offset a highlight for solr (4.1/4.2)?

2013-04-15 Thread Dmitry Kan
Hi, They are available in the HighlighterComponent. You will need to read the source code. Dmitry On Wed, Mar 27, 2013 at 4:28 PM, Skealler Nametic bchaillou...@gmail.com wrote: Hi, I would like to retrieve the position and offset of each highlighting found. I searched on the internet,

Re: Solr using a ridiculous amount of memory

2013-04-15 Thread John Nielsen
I did a search. I have no occurrence of UnInverted in the solr logs. Another explanation for the large amount of memory presents itself if you use a single index: If each of your clients facet on at least one field specific to the client (client123_persons or something like that), then your

SolrCloud Leaders

2013-04-15 Thread Furkan KAMACI
Is the number of leaders in a SolrCloud equal to the number of shards?

Re: Solr using a ridiculous amount of memory

2013-04-15 Thread Upayavira
Might be obvious, but just in case - remember that you'll need to re-index your content once you've added docValues to your schema, in order to get the on-disk files to be created. Upayavira On Mon, Mar 25, 2013, at 03:16 PM, John Nielsen wrote: I apologize for the slow reply. Today has been

Re: SolrCloud Leaders

2013-04-15 Thread Upayavira
It is supposed to be one leader per shard, yes. Upayavira On Mon, Apr 15, 2013, at 01:21 PM, Furkan KAMACI wrote: Is the number of leaders in a SolrCloud equal to the number of shards?

Re: SolrCloud Leaders

2013-04-15 Thread Jack Krupansky
When the cluster is fully operational, yes. But if part of the cluster is down or split and unable to communicate, or leader election is in progress, the actual count of leaders will not be indicative of the number of shards. Leaders and shards are apples and oranges. If you take down a

Re: SolrCloud Leaders

2013-04-15 Thread Furkan KAMACI
Can leaders respond to search requests (I mean, do they store indexes), both when I first start SolrCloud and later on? 2013/4/15 Jack Krupansky j...@basetechnology.com When the cluster is fully operational, yes. But if part of the cluster is down or split and unable to communicate,

Number of unique terms in a field

2013-04-15 Thread Andreas Hubold
Hi, in previous versions of Solr (at least with 1.4.1) the admin page displayed the number of unique terms in the index / in a field. I cannot find this on the new admin page anymore (Solr 4.0.0). Can somebody please give me a pointer or is this info not available anymore? Thank you, Andreas

Re: SolrCloud Leaders

2013-04-15 Thread Furkan KAMACI
This page: https://support.lucidworks.com/entries/22180608-Solr-HA-DR-overview-3-x-and-4-0-SolrCloud-and says: Both leaders and replicas index items and perform searches. How do replicas index items? 2013/4/15 Furkan KAMACI furkankam...@gmail.com Can leaders respond to search

Re: Number of unique terms in a field

2013-04-15 Thread Stefan Matheis
Andreas It's still there :) Open the UI, select a core, go to the Schema Browser, select the field from the drop down and click on the Load Term Info Button (right side, below properties analyzer). Then there's a [10] / 20315 Top-Terms row - right hand of the button you've actually clicked

RE: Getting page number of result with tika

2013-04-15 Thread Gian Maria Ricci
Thanks a lot, I'm curious if anyone has this kind of need and tried that old patch to Solr 4+ and got it working. Gian Maria. -Original Message- From: Erick Erickson [mailto:erickerick...@gmail.com] Sent: Saturday, April 13, 2013 3:40 PM To: solr-user@lucene.apache.org; Gian Maria Ricci

Re: Number of unique terms in a field

2013-04-15 Thread Andreas Hubold
Hi Stefan, with Solr 4.0.0 I just get 10 / -1. I just tried it with Solr 4.2.1 and the example application and it seems to work there. Maybe this has been fixed/improved since 4.0.0. Thanks, Andreas Stefan Matheis wrote on 15.04.2013 15:49: Andreas It's still there :) Open the UI, select

Usage of CloudSolrServer?

2013-04-15 Thread Furkan KAMACI
I am reading the Lucidworks Solr Guide; in the SolrCloud section it says: *Read Side Fault Tolerance* With earlier versions of Solr, you had to set up your own load balancer. Now each individual node load balances requests across the replicas in a cluster. You still need a load balancer on the 'outside'

Re: Tokenize on paragraphs and sentences

2013-04-15 Thread Jack Krupansky
Technically, yes, but you would have to do a lot of work yourself. Like, a sentence/paragraph recognizer that inserted sentence and paragraph markers, and a query parser that allows you to do SpanNear and SpanNot (to selectively exclude sentence or paragraph marks based on your granularity of

Re: SolrCloud Leaders

2013-04-15 Thread Jack Krupansky
All nodes are replicas in SolrCloud since there are no masters. It's a fully distributed model. A leader is also a replica. A leader is simply a replica which was elected to be a leader, for now. An hour from now some other replica may be the leader. It is indeed misleading and inaccurate to

Dynamic data model design questions

2013-04-15 Thread Marko Asplund
I'm implementing a backend service that stores data in JSON format and I'd like to provide a search operation in the service. The data model is dynamic and will contain arbitrarily complex object graphs. How do I index object graphs with Solr? Does the data need to be flattened before indexing?

solr tdate field

2013-04-15 Thread hassancrowdc
Hi, I have a date field being indexed into Solr. In my schema I have the following code for it: <field name="createdDate" type="date" indexed="true" stored="true" required="true" /> but in Java, I get the following error when I search using Solr: java.lang.ClassCastException: java.lang.String cannot be

Re: solr tdate field

2013-04-15 Thread Jack Krupansky
Check your date field type to make sure it really is solr.DateField or solr.TrieDateField Then check whether you have a function query with an ms function that references a non-TrieDateField. -- Jack Krupansky -Original Message- From: hassancrowdc Sent: Monday, April 15, 2013

SolrException parsing error

2013-04-15 Thread Luis Lebolo
Hi All, I'm using Solr 4.1 and am receiving an org.apache.solr.common.SolrException parsing error with root cause java.io.EOFException (see below for stack trace). The query I'm performing is long/complex and I wonder if its size is causing the issue? I am querying via POST through SolrJ. The

3 general questions about SolrCloud

2013-04-15 Thread SuoNayi
Dear list, Sorry for these general questions; I'm really confused now. 1. What's the model between the master and replicas in one shard? If the replicas are able to catch up with the master when the master receives an update request it will scatter the request to all the active replicas and

Query Parser OR AND and NOT

2013-04-15 Thread Peter Schütt
Hello, I do not really understand the query language of the SOLR-Queryparser. I use SOLR 4.2 and I have nearly 20 sample address records in the SOLR-Database. I only use the q field in the SOLR Admin Web GUI and all other controls on this page are left at their defaults. First category:

Re: Query Parser OR AND and NOT

2013-04-15 Thread Roman Chyla
should be: -city:H* OR zip:30* On Mon, Apr 15, 2013 at 12:03 PM, Peter Schütt newsgro...@pstt.de wrote: Hello, I do not really understand the query language of the SOLR-Queryparser. I use SOLR 4.2 and I have nearly 20 sample address records in the SOLR-Database. I only use the q

Re: solr tdate field

2013-04-15 Thread hassancrowdc
<fieldType name="date" class="solr.TrieDateField" precisionStep="0" positionIncrementGap="0"/> This is the date field in my schema.xml, and I do not get the second point: how would I reference a non-TrieDateField? -- View this message in context:

Re: solr tdate field

2013-04-15 Thread Jack Krupansky
Show us the full query URL (at least all the parameters) and the defaults from the request handler in solrconfig. -- Jack Krupansky -Original Message- From: hassancrowdc Sent: Monday, April 15, 2013 12:17 PM To: solr-user@lucene.apache.org Subject: Re: solr tdate field fieldType

Re: Query Parser OR AND and NOT

2013-04-15 Thread Peter Schütt
Hallo, Roman Chyla roman.ch...@gmail.com wrote in news:caen8dywjrl+e3b0hpc9ntlmjtrkasrqlvkzhkqxopmlhhfn...@mail.gmail.com: should be: -city:H* OR zip:30* -city:H* OR zip:30* numFound:2520 gives the same wrong result. Another Idea? Ciao Peter Schütt

Re: solr tdate field

2013-04-15 Thread hassancrowdc
The query is as follows: localhost:8080/solr/collection1/select?wt=json&omitHeader=true&defType=dismax&rows=11&qf=manufacturer%20model%20displayName&fl=id&q=samsung and the request handler: <requestHandler name="standard" class="solr.StandardRequestHandler" default="true" /> <requestHandler name="/update"

Re: Query Parser OR AND and NOT

2013-04-15 Thread Chris Hostetter
: Hello, : I do not really understand the query language of the SOLR-Queryparser. http://www.lucidimagination.com/blog/2011/12/28/why-not-and-or-and-not/ The one comment i would add regarding your specific examples... : (!city:H*) OR zip:30* numFound: 2896 ...you can't have a boolean

Storing Solr Index on NFS

2013-04-15 Thread Ali, Saqib
Greetings, Are there any issues with storing Solr Indexes on a NFS share? Also any recommendations for using NFS for Solr indexes? Thanks, Saqib

Re: Query Parser OR AND and NOT

2013-04-15 Thread Luis Lebolo
What if you try city:(*:* -H*) OR zip:30* Sometimes Solr requires a list of documents to subtract from (think of *:* -someQuery converts to all documents without someQuery). You can also try looking at your query with debugQuery = true. -Luis On Mon, Apr 15, 2013 at 12:25 PM, Peter Schütt
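
For anyone wanting to try the suggestion above outside the admin GUI, here is a minimal Python sketch that builds a select URL with the query from this message and debugQuery switched on. The base URL and the helper name `build_select_url` are placeholders, not something taken from the thread.

```python
from urllib.parse import urlencode


def build_select_url(base_url: str, query: str) -> str:
    """Build a /select URL with debugQuery enabled; special characters are URL-encoded."""
    params = {"q": query, "debugQuery": "true"}
    return base_url + "/select?" + urlencode(params)


# Placeholder host and core name; adjust to your own installation.
url = build_select_url("http://localhost:8983/solr/collection1",
                       "city:(*:* -H*) OR zip:30*")
print(url)
```

The parsed query in the debug section of the response should show how the negative clause was actually interpreted.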

Re: updateLog in Solr 4.2

2013-04-15 Thread Shawn Heisey
On 4/12/2013 7:17 AM, vicky desai wrote: and solr fails to start . However if i add updatelog in my solrconfig.xml it starts. Is the update log parameter mandatory for solr4.2 You are using SolrCloud. SolrCloud requires both updateLog and replication to be enabled. As you probably know,

Re: Storing Solr Index on NFS

2013-04-15 Thread Walter Underwood
On Apr 15, 2013, at 9:40 AM, Ali, Saqib wrote: Greetings, Are there any issues with storing Solr Indexes on a NFS share? Also any recommendations for using NFS for Solr indexes? I recommend that you do not put Solr indexes on NFS. It can be very slow, I measured indexing as 100X slower on

Re: Storing Solr Index on NFS

2013-04-15 Thread Ali, Saqib
Hello Walter, Thanks for the response. That has been my experience in the past as well. But I was wondering if there are new things in Solr 4 and NFS 4.1 that make the storing of indexes on a NFS mount feasible. Thanks, Saqib On Mon, Apr 15, 2013 at 9:47 AM, Walter Underwood

Re: Query Parser OR AND and NOT

2013-04-15 Thread Roman Chyla
Oh, sorry, I have assumed lucene query parser. I think SOLR qp must be different then, because for me it works as expected (our qp parser is identical with lucene in the way it treats modifiers +/- and operators AND/OR/NOT -- NOT must be joining two clauses: a NOT b, the first cannot be negative,

Re: SolrCloud vs Solr master-slave replication

2013-04-15 Thread Shawn Heisey
On 4/15/2013 3:38 AM, Victor Ruiz wrote: About SolrCloud, I know it doesn't use master-slave replication, but incremental updates, item by item. That's why I thought it could work for us, since our bottleneck appears to be the replication cycles. But another point is, if the indexing occurs in

Re: Storing Solr Index on NFS

2013-04-15 Thread Walter Underwood
Solr 4.2 does have field compression which makes smaller indexes. That will reduce the amount of network traffic. That probably does not help much, because I think the latency of NFS is what causes problems. wunder On Apr 15, 2013, at 9:52 AM, Ali, Saqib wrote: Hello Walter, Thanks for

Re: Grouping performance problem

2013-04-15 Thread davidduffett
Agnieszka, Did you find a good solution to your performance problem with grouping? I have an index with 45m records and am using grouping and the performance is atrocious. Any advice would be very welcome! Thanks in advance, David -- View this message in context:

Re: Usage of CloudSolrServer?

2013-04-15 Thread Shawn Heisey
On 4/15/2013 8:05 AM, Furkan KAMACI wrote: My system is as follows: I crawl data with Nutch and send them into SolrCloud. Users will search at Solr. What is that CloudSolrServer, should I use it for load balancing or is it something else different? It appears that the Solr integration in

Re: SolrException parsing error [Solved]

2013-04-15 Thread Luis Lebolo
Sorry, spoke too soon. Turns out I was not sending the query via POST. Changing the method to POST solved the issue. Apologies for the spam! -Luis On Mon, Apr 15, 2013 at 11:47 AM, Luis Lebolo luis.leb...@gmail.com wrote: Hi All, I'm using Solr 4.1 and am receiving an

Re: Dynamic data model design questions

2013-04-15 Thread Shawn Heisey
On 4/15/2013 8:40 AM, Marko Asplund wrote: I'm implementing a backend service that stores data in JSON format and I'd like to provide a search operation in the service. The data model is dynamic and will contain arbitrarily complex object graphs. How do I index object graphs with Solr? Does the

Re: SolrException parsing error

2013-04-15 Thread Shawn Heisey
On 4/15/2013 9:47 AM, Luis Lebolo wrote: Hi All, I'm using Solr 4.1 and am receiving an org.apache.solr.common.SolrException parsing error with root cause java.io.EOFException (see below for stack trace). The query I'm performing is long/complex and I wonder if its size is causing the issue? I

Re: 3 general questions about SolrCloud

2013-04-15 Thread Shawn Heisey
On 4/15/2013 9:58 AM, SuoNayi wrote: 1. What's the model between the master and replicas in one shard? If the replicas are able to catch up with the master when the master receives an update request it will scatter the request to all the active replicas and expect responses before the request get

Re: Spellchecker not working for Solr 4.1

2013-04-15 Thread davers
I am using spellcheck=true when I post the search. ex. solr/productindex/productQuery?q=fuacet&spellcheck=true -- View this message in context: http://lucene.472066.n3.nabble.com/Spellchecker-not-working-for-Solr-4-1-tp4055450p4056131.html Sent from the Solr - User mailing list archive at

Trigger documents update in a collection

2013-04-15 Thread Francois Perron
Hi all, I want to use Solr4 as a NoSQL. My 'ideal' workflow is to add/update documents in a collection (NoSQL) and automatically update changes in another collection with more specific search capabilities. The nosql collection will contain all my documents (750M docs). The 'searchable'

Document adds, deletes, and commits ... a question about visibility.

2013-04-15 Thread Shawn Heisey
Simple question first: Is there anything in SolrJ that prevents indexing more than 500 documents in one request? I'm not aware of anything myself, but a co-worker remembers running into something, so his code is restricting them to 490 docs. The only related limit I'm aware of is the POST
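
Whether or not any such per-request limit exists (nothing in this message confirms one), splitting a large set of documents into fixed-size batches is easy to do on the client side. The Python sketch below is a generic illustration, not SolrJ code; the helper name `batches` and the stand-in document list are made up, and 490 is simply the figure mentioned above.

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batches(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `size` items, preserving order."""
    if size < 1:
        raise ValueError("size must be at least 1")
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # final partial batch, if any
        yield batch


docs = [{"id": str(n)} for n in range(1000)]  # stand-in documents
print([len(b) for b in batches(docs, 490)])   # [490, 490, 20]
```

Each yielded batch would then be sent as one add request by whatever client is in use.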

Re: Document adds, deletes, and commits ... a question about visibility.

2013-04-15 Thread Michael McCandless
At the Lucene level, you don't have to commit before doing the deleteByQuery, i.e. 'a' will be correctly deleted without any intervening commit. Mike McCandless http://blog.mikemccandless.com On Mon, Apr 15, 2013 at 3:57 PM, Shawn Heisey s...@elyograg.org wrote: Simple question first: Is there

Re: Trigger documents update in a collection

2013-04-15 Thread Otis Gospodnetic
Hi, Doable with a custom Update Request Processor, yes. Otis Solr ElasticSearch Support http://sematext.com/ On Apr 15, 2013 3:14 PM, Francois Perron francois.per...@wantedanalytics.com wrote: Hi all, I want to use Solr4 as a NoSQL. My 'ideal' workflow is to add/update documents in a

Re: SolR InvalidTokenOffsetsException with Highlighter and Synonyms

2013-04-15 Thread juancesarvillalba
Hi, Before, I had a different configuration that was working, but with synonyms at query time. Now I have a requirement to add multi-word synonyms, which is why I am testing this configuration. It still doesn't work with this configuration, even without multi-word synonyms. The problem happens only

Re: Storing Solr Index on NFS

2013-04-15 Thread Tim Vaillancourt
If centralization of storage is your goal by choosing NFS, iSCSI works reasonably well with SOLR indexes, although good local-storage will always be the overall winner. I noticed a near 5% degradation in overall search performance (casual testing, nothing scientific) when moving a 40-50GB

Re: using maven to deploy solr on tomcat

2013-04-15 Thread Shawn Heisey
On 4/15/2013 2:33 PM, Adeel Qureshi wrote: <Environment name="solr/home" override="true" type="java.lang.String" value="src/main/resources/solr-dev"/> but this leads to absolute path of INFO: Using JNDI solr.home: src/main/resources/solr-dev INFO: looking for solr.xml:

Re:Re: 3 general questions about SolrCloud

2013-04-15 Thread SuoNayi
Thanks for clarification and I think I did make it clear. At 2013-04-16 01:59:59,Shawn Heisey s...@elyograg.org wrote: On 4/15/2013 9:58 AM, SuoNayi wrote: 1. What's the model between the master and replicas in one shard? If the replicas are able to catch up with the master when the master

Push/pull model between leader and replica in one shard

2013-04-15 Thread SuoNayi
Hi, can someone explain in more detail what model is used to sync docs between the leader and replicas in a shard? The model can be push or pull. Supposing I have only one shard that has 1 leader and 2 replicas, when the leader receives an update request, will it scatter the request to

Re: SolR InvalidTokenOffsetsException with Highlighter and Synonyms

2013-04-15 Thread Dmitry Kan
Do you use the standard highlighter or FastVectorHighlighter / PhraseHighlighter? Do you use the hl.highlightMultiTerm (http://wiki.apache.org/solr/HighlightingParameters#hl.highlightMultiTerm) option? On Tue, Apr 16, 2013 at 2:51 AM, juancesarvillalba juancesarvilla...@gmail.com wrote: Hi,