Re: Solr cloud and auto shard timeline

2013-03-22 Thread Otis Gospodnetic
Hi, I think there is a mixup here. SolrCloud has the same sharding capabilities as ES at this point, I believe, other than manual moving of shards Mark mentions. Otis -- Solr ElasticSearch Support http://sematext.com/ On Thu, Mar 21, 2013 at 7:08 PM, Jamie Johnson jej2...@gmail.com wrote:

Re: Writing new indexes from index readers slow!

2013-03-22 Thread Otis Gospodnetic
Jed, While this is something completely different, have you considered using SolrEntityProcessor instead? (assuming all your fields are stored) http://wiki.apache.org/solr/DataImportHandler#SolrEntityProcessor Otis -- Solr ElasticSearch Support http://sematext.com/ On Thu, Mar 21, 2013 at

Re: Writing new indexes from index readers slow!

2013-03-22 Thread Jed Glazner
Thanks Otis, I had not considered that approach, however not all of our fields are stored so that's not going to work for me. I'm wondering if it's slow because there is just the one reader getting passed to the index writer... I noticed today that the addIndexes method can take an array of

Re: Solr 4.2 - Slave Index version is higher than Master

2013-03-22 Thread Mark Miller
The other odd thing here is that this should not stop replication at all. When the slave is ahead, it will still have its index replaced. - Mark On Mar 22, 2013, at 1:26 AM, Mark Miller markrmil...@gmail.com wrote: I'm working on testing to try and catch what you are seeing here:

solr 4.1 replcation whole indexs files from leader

2013-03-22 Thread Brad Hill
Hi,  I use solrcloud 4.1.  I start up two solr nodes A and B and then created a new collection using CoreAdmin to A using one shard, so Node A is leader.  Then I index some docs to it. Then I created the same collection using CoreAdmin to B to become a replica. I found that solr will sync all

Re: DocValues and field requirements

2013-03-22 Thread Marcin Rzewucki
Hi Shawn, Thank you for your response. Yes, that's strange. By enabling DocValues the information about missing fields is lost, which changes the way of sorting as well. Adding a default value to the fields can change the logic of the application dramatically (I can't set default value to 0 for all

RE: Solr 4.2 - Slave Index version is higher than Master

2013-03-22 Thread John, Phil (CSS)
To add to the discussion. We're running classic master/slave replication (not solrcloud) with 1 master and 2 slaves and I noticed the slave having a higher version number than the master the other day as well. In our case, knock on wood, it hasn't stopped replication. If you'd like a copy

Re: Solr cloud and auto shard timeline

2013-03-22 Thread Jamie Johnson
I am sorry for the confusion. I had assumed that there was a way to issue commands to ES to have it change its current shard layout (i.e. go from 2 to 4 for instance), but on further reading of their documentation I do not see that. That being said, is there a timeline on being able to add shards

Re: Don't cache filter queries

2013-03-22 Thread Dotan Cohen
On Thu, Mar 21, 2013 at 6:22 PM, Chris Hostetter hossman_luc...@fucit.org wrote: : Just add {!cache=false} to the filter in your query : (http://wiki.apache.org/solr/SolrCaching#filterCache). ... : I need to use the filter query feature to filter my results, but I : don't want the
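
A minimal sketch of the suggestion quoted in this snippet, building request parameters whose filter query carries the `{!cache=false}` local parameter. Only `{!cache=false}` comes from the thread; the helper name `uncached_filter_params`, the `*:*` query and the `category:books` filter are hypothetical and used purely for illustration.

```python
from urllib.parse import urlencode, parse_qs


def uncached_filter_params(query, filter_query):
    """Return URL-encoded Solr params whose fq is prefixed with {!cache=false}."""
    return urlencode({
        "q": query,
        # The local-params prefix asks Solr not to store this filter in the filterCache.
        "fq": "{!cache=false}" + filter_query,
    })


# Hypothetical query and filter, for illustration only.
params = uncached_filter_params("*:*", "category:books")
```

The resulting string can be appended to a select URL; decoding it with `parse_qs` shows the filter exactly as Solr would receive it.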

Re: Solr 4.2 - Slave Index version is higher than Master

2013-03-22 Thread Bernd Fehling
That issue was already present in Solr 4.1. http://lucene.472066.n3.nabble.com/replication-problems-with-solr4-1-td4039647.html Nice to know that it is still there in 4.2. With some luck it will make it to 4.2.1 ;-) Regards Bernd On 21.03.2013 21:08, Uomesh wrote: Hi, I am seeing an issue

Using Solr For a Real Search Engine

2013-03-22 Thread Furkan KAMACI
If I want to use Solr in a web search engine, what strategy should I follow for running Solr? Should I run it via the embedded Jetty, or deploy the WAR to a container? Note that I will have a heavy workload on my Solr.

RE: Logging inside a custom analyzer

2013-03-22 Thread Gian Maria Ricci
Thanks a lot, it was exactly what I needed, sorry for not being so clear with my question :). Gian Maria. -Original Message- From: Erick Erickson [mailto:erickerick...@gmail.com] Sent: Tuesday, March 19, 2013 3:04 PM To: solr-user@lucene.apache.org; alkamp...@nablasoft.com Subject: Re:

Re: SOLR - Documents with large number of fields ~ 450

2013-03-22 Thread Marcin Rzewucki
Hi, I have a collection with more than 4K fields, but mostly Trie*Field types. It is used for faceting, sorting, searching and statsComponent. It works pretty well on Amazon 4 x m1.large (7.5GB RAM) EC2 boxes. I'm using SolrCloud, a multi-AZ setup and ephemeral storage. Index is managed by mmap, 4GB

Re: Solr cloud and auto shard timeline

2013-03-22 Thread Jamie Johnson
Yes Anshum, exactly what I was looking for. Is this being targeted in a particular solr release? I see that some of the related issues are targeted for 4.3, is that the goal for this as well? On Fri, Mar 22, 2013 at 8:07 AM, Anshum Gupta ans...@anshumgupta.net wrote: Hi Jamie, There's

Re: Slow queries for common terms

2013-03-22 Thread Jan Høydahl
Hi, There might not be a final cure with more RAM if you are CPU bound. Scoring 90M docs is some work. Can you check what's going on during those 15 seconds? Is your CPU at 100%? Try a (foo OR bar OR baz) search which generates 100 million hits and see if that is slow too, even if you don't use

Solr 4.2 replcation whole index files mechanism.

2013-03-22 Thread bradhill99
Hi, I use solrcloud 4.1. I start up two solr nodes A and B and then created a new collection using CoreAdmin to A using one shard, so Node A is leader. Then I index some docs to it. Then I created the same collection using CoreAdmin to B to become a replica. I found that solr will sync all

Urgent:Solr cloud issue

2013-03-22 Thread anuj vats
Hi Shawan, I have seen your post on the SolrCloud master-master configuration on two servers. I have to use the same Solr structure, but for a long time I have not been able to configure it to communicate between two servers; on a single server it works fine. Can you please help me out by providing the required config

PatternReplaceFilterFactory -- what does this regex do?

2013-03-22 Thread Eric Wilson
I'm using the Solr Suggester for autocompletion with WFSTLookup suggest component, and a text file with phrases and weights. ( http://wiki.apache.org/solr/Suggester) I found that the following filter made it impossible to match on ampersands. So I removed it. But I'm sure it was there for a

Re: Sort-field for ALL docs in FieldCache for sort queries - OOM on lots of docs

2013-03-22 Thread Per Steffensen
On 3/21/13 10:50 PM, Shawn Heisey wrote: On 3/21/2013 4:05 AM, Per Steffensen wrote: Can anyone else elaborate? How to activate it? How to make sure, for sorting, that sort-field-value for all docs are not read into memory for sorting - leading to OOM when you have a lot of docs? Can this

Re: Sort-field for ALL docs in FieldCache for sort queries - OOM on lots of docs

2013-03-22 Thread Shawn Heisey
On 3/22/2013 8:54 AM, Per Steffensen wrote: Me too. I will find out soon - I hope! But re-indexing is kinda a problem for us, but we will figure out. Any guide to re-index all your stuff anywhere, so I do it the easiest way? Guess maybe there are some nice tricks about streaming data directly

Solr 4.2, reindexing, transaction logs, high memory usage

2013-03-22 Thread Raghav Karol
Dear List, We are using solr-4.2 to build an index of 5M docs each limited to 6K in size. Conceptually we are modelling a stack of documents. Here is an excerpt from our schema.xml: dynamicField name=publicationBody_* type=string indexed=false stored=true multiValued=false

Re: Solr 4.2 replcation whole index files mechanism.

2013-03-22 Thread Mark Miller
There are a few things going on here that caused this, all resolved in 4.2 as far as I know. - Mark On Mar 22, 2013, at 3:56 AM, bradhill99 bradhil...@yahoo.com wrote: Hi, I use solrcloud 4.1. I start up two solr nodes A and B and then created a new collection using CoreAdmin to A using

Re: Solr 4.2 - Slave Index version is higher than Master

2013-03-22 Thread Mark Miller
Are you replicating configuration files as well? - Mark On Mar 22, 2013, at 6:38 AM, John, Phil (CSS) philj...@capita.co.uk wrote: To add to the discussion. We're running classic master/slave replication (not solrcloud) with 1 master and 2 slaves and I noticed the slave having a higher

Re: SOLR - Documents with large number of fields ~ 450

2013-03-22 Thread John Nielsen
with the on disk option. Could you elaborate on that? On 22/03/2013 05.25, Mark Miller markrmil...@gmail.com wrote: You might try using docvalues with the on disk option and try and let the OS manage all the memory needed for all the faceting/sorting. This would require Solr 4.2. - Mark

Re: Solr 4.2 - Slave Index version is higher than Master

2013-03-22 Thread Uomesh
Hi Mark, I am replicating the config files below but not replicating solrconfig.xml. confFiles: schema.xml, elevate.xml, stopwords.txt, mapping-FoldToASCII.txt, mapping-ISOLatin1Accent.txt, protwords.txt, spellings.txt, synonyms.txt Also, strangely, I am seeing a big Gen difference between Master and

Re: Solr 4.2 - Slave Index version is higher than Master

2013-03-22 Thread Uomesh
Also, I am replicating only on commit and startup. Thanks, Umesh On Fri, Mar 22, 2013 at 11:23 AM, Umesh Sharma uom...@gmail.com wrote: Hi Mark, I am replicating the config files below but not replicating solrconfig.xml. confFiles: schema.xml, elevate.xml, stopwords.txt,

NoSuchMethodError updateDocument

2013-03-22 Thread Furkan KAMACI
I use Solr 4.1.0 and Nutch 2.1, Java 1.7.0_17, Tomcat 7.0, IntelliJ IDEA 12, with CentOS 6.4 on my 64-bit computer. I run this command successfully: bin/nutch solrindex http://localhost:8080/solr -index However, when I run this command: bin/nutch solrindex http://localhost:8080/solr -reindex I

Re: Solr 4.2 - Slave Index version is higher than Master

2013-03-22 Thread Mark Miller
And you're also on 4.2? - Mark On Mar 22, 2013, at 12:41 PM, Uomesh uom...@gmail.com wrote: Also, I am replicating only on commit and startup. Thanks, Umesh On Fri, Mar 22, 2013 at 11:23 AM, Umesh Sharma uom...@gmail.com wrote: Hi Mark, I am replicating the config files below but not

Re: strange behaviour of wordbreak spellchecker in solr cloud

2013-03-22 Thread alxsss
Hello, Further investigation shows the following pattern, for both DirectIndex and wordbreak spellcheckers. Assume that in all cases there are spellchecker results when distrib=false. In distributed mode (distrib=true) case when matches=0 1. group=true, no spellcheck results 2.

Re: Solr 4.2, reindexing, transaction logs, high memory usage

2013-03-22 Thread Shawn Heisey
On 3/22/2013 9:24 AM, Raghav Karol wrote: We run this index sharded across 8 Solr cores on a single host, an m2.4xlarge EC2 instance. We do not use zookeeper (because of operational issues on our live indexes) and manage the sharding ourselves. For this index we run with -Xmx30G and

Re: Slow queries for common terms

2013-03-22 Thread Tom Burton-West
Hi David and Jan, I wrote the blog post, and David, you are right, the problem we had was with phrase queries because our positions lists are so huge. Boolean queries don't need to read the positions lists. I think you need to determine whether you are CPU bound or I/O bound. It is possible

Re: DocValues and field requirements

2013-03-22 Thread Chris Hostetter
: Thank you for your response. Yes, that's strange. By enabling DocValues the : information about missing fields is lost, which changes the way of sorting : as well. Adding a default value to the fields can change the logic of the : application dramatically (I can't set default value to 0 for all :

Re: Can we manipulate termfreq to count as 1 for multiple matches?

2013-03-22 Thread Chris Hostetter
: parameter *omitTermFreqAndPositions* the key thing to remember being: if you use this, then by omitting positions you can no longer do phrase queries. : or you can use a custom similarity class that overrides the term freq and : return one for only that field. :

Re: transientCacheSize not working

2013-03-22 Thread didier deshommes
I've created an issue and patch here that makes it possible to specify transient and loadOnStartup on core creation: https://issues.apache.org/jira/browse/SOLR-4631 On Wed, Mar 20, 2013 at 10:14 AM, didier deshommes dfdes...@gmail.com wrote: Thanks. Is there a way to pass loadOnStartup and/or

Re: how to get term vector information of sepcific word/position in field

2013-03-22 Thread Chris Hostetter
: is there any way, if i can get term vector information of specific word : only, like i can pass the word, and it will just return term position and : frequency for that word only? : : and also if i can pass the position e.g. startPosition=5 and endPosition=10; : then it will return terms,

Re: Solr 4.2 - Slave Index version is higher than Master

2013-03-22 Thread Mark Miller
That was to you Phil. So it seems this is a problem with the configuration replication case I would guess - I didn't really look at that path in the 4.2 fixes I worked on. I did add it to the new testing I'm doing since I've suspected it (it will prompt a core reload that doesn't happen when

RE: strange behaviour of wordbreak spellchecker in solr cloud

2013-03-22 Thread Dyer, James
Alex, I added your comments to SOLR-3758 (https://issues.apache.org/jira/browse/SOLR-3758), which seems to me to be the very same issue. If you need this to work now and if you cannot devise a fix yourself, then perhaps a workaround is if the query returns with 0 results, re-issue the query

Re: strange behaviour of wordbreak spellchecker in solr cloud

2013-03-22 Thread alxsss
Thanks. I can fix this, but going over code it seems it is not easy to figure out where the whole request and response come from. I followed up SpellCheckComponent#finishStage and found out that SearchHandler#handleRequestBody calls this function. However, which part calls

Re: Did something change with Payloads?

2013-03-22 Thread jimtronic
Ok, this is very bizarre. If I insert more than one document at a time using the update handler like so: [{id:1,foo_ap:bar|50},{id:2,foo_ap:bar|75}] It actually stores the same payload value 50 for both docs. That seems like a bug, no? There was a core change in 4.1 to how payloads were
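
The archive appears to have stripped the quotes from the JSON in this snippet. A minimal sketch of how a well-formed version of that two-document batch could be built follows. It assumes that `id` and `foo_ap` are sent as strings (the original typing is not recoverable from the snippet) and that `bar|50` is a term followed by its payload; the variable names `docs` and `body` are hypothetical.

```python
import json

# Two documents, each carrying a different payload on the same term.
# Assumption: both field values are strings; the archive dropped the quoting.
docs = [
    {"id": "1", "foo_ap": "bar|50"},
    {"id": "2", "foo_ap": "bar|75"},
]

# Serialized request body for a JSON update request.
body = json.dumps(docs)
```

Parsing `body` back with `json.loads` is a quick way to confirm that each document keeps its own payload value before the batch is sent.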

Re: overseer queue clogged

2013-03-22 Thread Gary Yngve
Thanks, Mark! The core node names in the solr.xml in solr4.2 is great! Maybe in 4.3 it can be supported via API? Also I am glad you mentioned in other post the chance to namespace zookeeper by adding a path to the end of the comma-delim zk hosts. That works out really well in our situation for
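
A minimal sketch of the ZooKeeper namespacing mentioned in this snippet, where a single path is appended to the end of the comma-delimited host list. The helper `zk_host_with_chroot`, the host names and the `solr-staging` path are hypothetical; only the shape of the string (hosts joined by commas, then one trailing path) is taken from the thread.

```python
def zk_host_with_chroot(hosts, chroot):
    """Join ZooKeeper hosts with commas and append one chroot path at the end."""
    # Normalize so the path has exactly one leading slash and no trailing slash.
    path = "/" + chroot.strip("/")
    return ",".join(hosts) + path


# Hypothetical ensemble and namespace, for illustration only.
zk_host = zk_host_with_chroot(["zk1:2181", "zk2:2181", "zk3:2181"], "solr-staging")
```

The path is added once after the last host rather than after each host, which is what lets several clusters share one ZooKeeper ensemble under separate namespaces.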

Re: overseer queue clogged

2013-03-22 Thread Mark Miller
On Mar 22, 2013, at 5:54 PM, Gary Yngve gary.yn...@gmail.com wrote: Thanks, Mark! The core node names in the solr.xml in solr4.2 is great! Maybe in 4.3 it can be supported via API? It is with the core admin api - do you mean the collections api? Please make a JIRA for any feature

RE: strange behaviour of wordbreak spellchecker in solr cloud

2013-03-22 Thread Dyer, James
Alex, You may want to move over to the dev user's list now that you're working on code. Or if you would rather not subscribe to the dev-list, add yourself as a watcher to SOLR-3758 and comment further there. This will help us keep track of progress for the issue. The short answer is that in

Re: Did something change with Payloads?

2013-03-22 Thread Mark Miller
On Mar 22, 2013, at 5:54 PM, jimtronic jimtro...@gmail.com wrote: Ok, this is very bizarre. If I insert more than one document at a time using the update handler like so: [{id:1,foo_ap:bar|50},{id:2,foo_ap:bar|75}] It actually stores the same payload value 50 for both docs. That

doc cache issues... query-time way to bypass cache?

2013-03-22 Thread Gary Yngve
I have a situation we just discovered in solr4.2 where there are previously cached results from a limited field list, and when querying for the whole field list, it responds differently depending on which shard gets the query (no extra replicas). It either returns the document on the limited

Boost query parameter with Lucid parser and using query FunctionQuery

2013-03-22 Thread Miller, Will Jr
I have been playing around with the bq/bf/boost query parameters available in dismax/edismax. I am using the Lucid parser as my default parser for the query. The lucid parser is an extension of the DisMax parser and should contain everything that is available in that parser. My goal is boost

Re: NoSuchMethodError updateDocument

2013-03-22 Thread Furkan KAMACI
I just set this JVM parameter: -Dsolr.solr.home=/home/projects/lucene-solr/solr/solr_home solr_home is where my config files etc. live. My solr.xml has these lines: cores adminPath=/admin/cores defaultCoreName=collection1 host=${host:} hostPort=${jetty.port:}

Re: Boost query parameter with Lucid parser and using query FunctionQuery

2013-03-22 Thread Jack Krupansky
You'll have to contact Lucid's support for questions about their code. (I've been away from that code too long to recall much about it.) -- Jack Krupansky -Original Message- From: Miller, Will Jr Sent: Friday, March 22, 2013 7:07 PM To: solr-user@lucene.apache.org Subject: Boost

Re: Boost query parameter with Lucid parser and using query FunctionQuery

2013-03-22 Thread Jan Høydahl
Why would you use dismax for the query() when you want to match a simple term to one field? If you share echoParams=all the answer may lie somewhere therein? -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com Solr Training - www.solrtraining.com 23 March 2013 at 00:07

Re: NoSuchMethodError updateDocument

2013-03-22 Thread Jan Høydahl
Are you 100% sure you use the exact jars for 4.1.0 *everywhere*, and that you're not blending older versions from the Nutch distro in your classpath here? Any ideas? BTW: What was your question here regarding Jetty vs Tomcat? -- Jan Høydahl, search solution architect Cominvent AS -

RE: Boost query parameter with Lucid parser and using query FunctionQuery

2013-03-22 Thread Miller, Will Jr
This is the echo params... It looks like it ignores the qf in the FunctionQuery and instead takes the qf of the main query. <lst name="params"> <str name="spellcheck">true</str> <str name="facet">true</str> <str name="sort">score desc</str> <str name="facet.limit">11</str> <str

Question on highlighting of external fields

2013-03-22 Thread Jamie Johnson
Some time ago I had worked with a fellow developer to put together an addon to the (then) current Solr Highlighter to support fetching fields from an external source (like a database for instance). The general mechanics seem to work properly but I am seeing issues now where the highlights do not