How to reindex in solr

2011-12-01 Thread Kashif Khan
Hi all, I have indexed my Solr instance completely, and now I have added a new field to the schema that is a copyField of another field. Please suggest how I can reindex Solr without going through the formal process I used the first time, because there are some fields whose data is really time

Re: Seek past EOF

2011-12-01 Thread Ruben Chadien
We are using ext3 on Debian. Noticed today that I only need to reload the core to get it working again… On 30 November 2011 19:59, Simon Willnauer simon.willna...@googlemail.com wrote: can you give us some details about what filesystem you are using? simon On Wed, Nov 30, 2011 at 3:07 PM,

Problem with hunspell french dictionary

2011-12-01 Thread Nathan Castelein
Hi, I'm trying to add the HunspellStemFilterFactory to my Solr project. I'm trying this on a fresh download of Solr 3.5. I downloaded the French dictionary here (found it from here http://wiki.services.openoffice.org/wiki/Dictionaries#French_.28France.2C_29):

Re: mysolr python client

2011-12-01 Thread Alejandro Gonzalez
Sounds great for a Python project I'm involved in right now. I'll take a deeper look at it. Thx Marco 2011/11/30 Marco Martinez mmarti...@paradigmatecnologico.com Hi all, For anyone interested, recently I've been using a new Solr client for Python. It's easy and pretty well documented. If

Re: mysolr python client

2011-12-01 Thread Jens Grivolla
On 11/30/2011 05:40 PM, Marco Martinez wrote: For anyone interested, recently I've been using a new Solr client for Python. It's easy and pretty well documented. If you're interested its site is: http://mysolr.redtuna.org/ Do you know what advantages it has over pysolr or solrpy? On the page

Re: Weird docs-id clustering output in Solr 1.4.1

2011-12-01 Thread Vadim Kisselmann
Hi Stanislaw, did you already have time to create a patch? If not, can you please tell me which lines in which class in the source code are relevant? Thanks and regards Vadim Kisselmann 2011/11/29 Vadim Kisselmann v.kisselm...@googlemail.com Hi, the quick and dirty way sounds good :) It would be

Re: Problem with hunspell french dictionary

2011-12-01 Thread Chris Male
It seems that there's a problem with the code parsing the dictionary. Can you open a JIRA issue with the same information so we can look into fixing it? On Thu, Dec 1, 2011 at 10:14 PM, Nathan Castelein nathan.castel...@gmail.com wrote: Hi, I'm trying to add the HunspellStemFilterFactory

Error in New Solr version

2011-12-01 Thread Pawan Darira
Hi, I am migrating from Solr 1.4 to Solr 3.2. I am getting the error below in my logs: org.apache.solr.common.SolrException: Error loading class 'org.apache.solr.handler.component.CollapseComponent I could not find a satisfactory solution on Google. Please help. Thanks, Pawan

Re: Error in New Solr version

2011-12-01 Thread Vadim Kisselmann
Hi, comment out the lines with the collapse component in your solrconfig.xml if you don't need it. Otherwise, you're missing the right jars for this component, or the paths to these jars in your solrconfig.xml are wrong. Regards, Vadim 2011/12/1 Pawan Darira pawan.dar...@gmail.com Hi I am migrating

Re: make fuzzy search for phrase

2011-12-01 Thread meghana
Any solutions? I am just stuck on this. :( -- View this message in context: http://lucene.472066.n3.nabble.com/make-fuzzy-search-for-phrase-tp3542079p3551203.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: mysolr python client

2011-12-01 Thread Marc SCHNEIDER
Hi Marco, Great! Maybe you can add it on the Solr wiki? ( http://wiki.apache.org/solr/IntegratingSolr). Regards, Marc. On Thu, Dec 1, 2011 at 10:42 AM, Jens Grivolla j+...@grivolla.net wrote: On 11/30/2011 05:40 PM, Marco Martinez wrote: For anyone interested, recently I've been using a new

Re: mysolr python client

2011-12-01 Thread Marco Martinez
Done! Marco Martínez Bautista http://www.paradigmatecnologico.com Avenida de Europa, 26. Ática 5. 3ª Planta 28224 Pozuelo de Alarcón Tel.: 91 352 59 42 2011/12/1 Marc SCHNEIDER marc.schneide...@gmail.com Hi Marco, Great! Maybe you can add it on the Solr wiki? (

Re: Solr and Ping PHP

2011-12-01 Thread akopov
Hi, I know it's been a while since you posted this question but I'm experiencing the same problem with my instance of Solr (sometimes ping returns false for no visible reason) and I just wonder if you found the solution. Thank you. -- View this message in context:

Re: Weird docs-id clustering output in Solr 1.4.1

2011-12-01 Thread Stanislaw Osinski
Hi Vadim, I've had limited connectivity, so I couldn't check out the complete 1.4.1 code and test the changes. Here's what you can try: In this file:

Re: make fuzzy search for phrase

2011-12-01 Thread Erick Erickson
What did you do to install it? What code line did you start from? 1.4 Solr? 3.1? Fresh trunk update? What jar? The usual method of applying a patch is to get the entire source tree, apply the patch and then re-compile all of solr. Perhaps this page would help:

highlight issue

2011-12-01 Thread Radha Krishna Reddy
Hi, I am indexing around 2000 names using Solr. The highlight flag is on while querying. For some names I am getting the search substring duplicated at the start. Suppose my search query is *Rak*. In my database I have the name *Rakesh Chaturvedi*. I am getting *<em>Rak</em><em>Rak</em>esh Chaturvedi* as the

Re: when using group=true facet numbers are incorrect

2011-12-01 Thread O. Klein
https://issues.apache.org/jira/browse/SOLR-2898 has been created for this. Thanx Martijn! -- View this message in context: http://lucene.472066.n3.nabble.com/when-using-group-true-facet-numbers-are-incorrect-tp3488605p3551741.html Sent from the Solr - User mailing list archive at Nabble.com.

(fq=field1:val1 AND field2:val2) VS fq=field1:val1&fq=field2:val2 and filterCache

2011-12-01 Thread Antoine LE FLOC'H
Hello, Is there any difference in the way things are stored in the filterCache if I do (fq=field1:val1 AND field2:val2) or fq=field1:val1&fq=field2:val2, even though these are logically identical? What gets stored exactly? Also, can you point me to where in the Solr source code this processing

Configuring the Distributed

2011-12-01 Thread Jamie Johnson
I am currently looking at the latest solrcloud branch and was wondering if there was any documentation on configuring the DistributedUpdateProcessor? What specifically in solrconfig.xml needs to be added/modified to make distributed indexing work?

Re: (fq=field1:val1 AND field2:val2) VS fq=field1:val1&fq=field2:val2 and filterCache

2011-12-01 Thread Tanguy Moal
Hello, Quoting http://wiki.apache.org/solr/SolrCaching#filterCache : The filter cache stores the results of any filter queries (fq parameters) that Solr is explicitly asked to execute. (Each filter is executed and cached separately. When it's time to use them to limit the number of results
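To make the two request forms from this thread concrete, here is a minimal Python sketch that only builds the query strings; it does not contact Solr. The field names and values are the placeholders from the question, and the helper name `filter_queries` is hypothetical. Going by the wiki text quoted above, the first form would presumably produce one filterCache entry for the whole conjunction, while the second would produce one entry per fq parameter.

```python
from urllib.parse import parse_qs, urlencode

# Form A: a single fq parameter containing a conjunction (one filter query).
combined = urlencode([("q", "*:*"), ("fq", "field1:val1 AND field2:val2")])

# Form B: two fq parameters (two filter queries, each executed and cached
# separately according to the SolrCaching wiki text quoted above).
separate = urlencode([("q", "*:*"), ("fq", "field1:val1"), ("fq", "field2:val2")])


def filter_queries(query_string):
    """Return the list of fq values Solr would receive for a query string."""
    return parse_qs(query_string)["fq"]


print(filter_queries(combined))
print(filter_queries(separate))
```

If field1:val1 is reused across many requests with different second filters, the second form lets that one filter be shared between them.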

Re: highlight issue

2011-12-01 Thread Koji Sekiguchi
Suppose my search query is *Rak*. In my database I have the name *Rakesh Chaturvedi*. I am getting *<em>Rak</em><em>Rak</em>esh Chaturvedi* as the response. The same is the case with the following names. Search Dhar -- highlight <em>Dhar</em><em>Dhar</em>mesh Darshan Search Suda-- highlight

Solr cache size information

2011-12-01 Thread elisabeth benoit
Hello, If anybody can help, I'd like to confirm a few things about Solr's cache configuration. I want to calculate the cache size in memory relative to the cache size in solrconfig.xml. For the document cache: size in memory = size in solrconfig.xml * average size of all fields defined in the fl parameter

Re: Weird docs-id clustering output in Solr 1.4.1

2011-12-01 Thread Vadim Kisselmann
Hi Stanislaw, unfortunately it doesn't work. I changed line 216 to use the new toString() part and rebuilt the source. Still the same behavior, without errors (because of the changes). Is there another line to change? Thanks and regards Vadim 2011/12/1 Stanislaw Osinski

switching on hl.requireFieldMatch reducing highlighted fields returned

2011-12-01 Thread Robert Brown
I have a query which is highlighting 3 snippets in 1 field, and 1 snippet in another field. By enabling hl.requireFieldMatch, only the latter highlighted field is returned. from this... <lst name="highlighting"> <lst name="348231"> <arr name="content_stemmed"> <str>plc Whetstone Temporary

Re: mysolr python client

2011-12-01 Thread Rubén Abad
Hi Jens, Our objective with mysolr was to create a pythonic Apache Solr binding. But we have also been working on speed and concurrency. We always use the Python QueryResponseWriter, because it spares us dependencies (an XML or JSON parser). We would also like to create a complete concurrent API,

DataImportHandler w/ multivalued fields

2011-12-01 Thread Briggs Thompson
Hello Solr Community! I am implementing a data connection to Solr through the Data Import Handler and non-multivalued fields are working correctly, but multivalued fields are not getting indexed properly. I am new to DataImportHandler, but from what I could find, the entity is the way to go for

spatial search or null

2011-12-01 Thread dan whelan
Hi, how would I go about constructing a solr 3.2 spatial query that would return documents that are in a specified radius OR documents that have no location information. The query would have a similar result as this: q=City:San Diego OR -City:['' TO *] Thanks

RE: Solr cache size information

2011-12-01 Thread Andrew Lundgren
For Filter cache size in memory = size in solrconfig.xml * WHAT (the size of an id) ??? (I don't use facet.enum method) As I understand it, size is the number of queries that will be cached. In my limited experience, the memory consumed will be data dependent. If you have a huge

Re: spatial search or null

2011-12-01 Thread Rob Brown
Recently had this myself... http://wiki.apache.org/solr/SpatialSearch#How_to_combine_with_a_sub-query_to_expand_results -- IntelCompute Web Design and Online Marketing http://www.intelcompute.com -Original Message- From: dan whelan d...@adicio.com Reply-to:

Re: DataImportHandler w/ multivalued fields

2011-12-01 Thread Briggs Thompson
In addition, I tried a query like the one below and changed the column definition to <field column="raw_tag" name="raw_tag" splitBy="," /> and still no luck. It is indexing the full content now but not as multivalued. It seems like the splitBy isn't working properly. select

Re: DataImportHandler w/ multivalued fields

2011-12-01 Thread Rahul Warawdekar
Hi Briggs, By saying multivalued fields are not getting indexed properly, do you mean that you are not able to search on those fields? Have you tried actually searching your Solr index for those multivalued terms to check whether it returns the search results? One possibility could be

Re: DataImportHandler w/ multivalued fields

2011-12-01 Thread Briggs Thompson
Hey Rahul, Thanks for the response. I actually just figured it out, thankfully :). To answer your question, the raw_tag is indexed and not stored (tokenized), and then there is a copyField for raw_tag to raw_tag_string which would be used for facets. That *should have* been displayed in the results.

Dealing with dashes with solr.PatternReplaceCharFilterFactory

2011-12-01 Thread Aaron Wong
Hi all, We're encountering a problem with querying terms with dashes (and other non-alphanumeric characters). For example, we use PatternReplaceCharFilterFactory to replace dashes with blank characters for both index and query, however any terms with dashes in them will not return any results.

Re: Configuring the Distributed

2011-12-01 Thread Mark Miller
On Thu, Dec 1, 2011 at 10:08 AM, Jamie Johnson jej2...@gmail.com wrote: I am currently looking at the latest solrcloud branch and was wondering if there was any documentation on configuring the DistributedUpdateProcessor? What specifically in solrconfig.xml needs to be added/modified to make

Re: Error in New Solr version

2011-12-01 Thread Samuel García Martínez
You are using the uncommitted FieldCollapse component for 1.4.x. In 3.x, the field collapse component is no longer that one. You must remove it and configure the out-of-the-box one. On Thu, Dec 1, 2011 at 11:34 AM, Vadim Kisselmann v.kisselm...@googlemail.com wrote: Hi, comment out the lines

Re: mysolr python client

2011-12-01 Thread Óscar Marín Miró
Nice job, pythonic solr access!! Thanks for the effort On Thu, Dec 1, 2011 at 5:53 PM, Rubén Abad rua...@gmail.com wrote: Hi Jens, Our objective with mysolr was to create a pythonic Apache Solr binding. But we also have been working in speed and concurrency. We always use the Python

Re: Configuring the Distributed

2011-12-01 Thread Jamie Johnson
Thanks I will try this first thing in the morning. On Thu, Dec 1, 2011 at 3:39 PM, Mark Miller markrmil...@gmail.com wrote: On Thu, Dec 1, 2011 at 10:08 AM, Jamie Johnson jej2...@gmail.com wrote: I am currently looking at the latest solrcloud branch and was wondering if there was any

Re: Configuring the Distributed

2011-12-01 Thread Jamie Johnson
Another question, is there any support for repartitioning of the index if a new shard is added? What is the recommended approach for handling this? It seemed that the hashing algorithm (and probably any) would require the index to be repartitioned should a new shard be added. On Thu, Dec 1,

Re: Configuring the Distributed

2011-12-01 Thread Mark Miller
Not yet - we don't plan on working on this until a lot of other stuff is working solid at this point. But someone else could jump in! There are a couple ways to go about it that I know of: A more long term solution may be to start using micro shards - each index starts as multiple indexes. This

Re: Configuring the Distributed

2011-12-01 Thread Jamie Johnson
I am not familiar with the index splitter that is in contrib, but I'll take a look at it soon. So the process sounds like it would be to run this on all of the current shards indexes based on the hash algorithm. Is there also an index merger in contrib which could be used to merge indexes? I'm

Re: Configuring the Distributed

2011-12-01 Thread Mark Miller
On Dec 1, 2011, at 7:20 PM, Jamie Johnson wrote: I am not familiar with the index splitter that is in contrib, but I'll take a look at it soon. So the process sounds like it would be to run this on all of the current shards indexes based on the hash algorithm. Not something I've thought

Re: Configuring the Distributed

2011-12-01 Thread Jamie Johnson
Hmmm. This doesn't sound like the hashing algorithm that's on the branch, right? The algorithm you're mentioning sounds like there is some logic which is able to tell that a particular range should be distributed between 2 shards instead of 1. So it seems like a trade-off between repartitioning

Multithreaded DIH bug

2011-12-01 Thread Mark
I'm trying to use multiple threads with DIH but I keep receiving the following error: Operation not allowed after ResultSet closed. Is there any way I can fix this? Dec 1, 2011 4:38:47 PM org.apache.solr.common.SolrException log SEVERE: Full Import failed:java.lang.RuntimeException: Error in

Re: (fq=field1:val1 AND field2:val2) VS fq=field1:val1&fq=field2:val2 and filterCache

2011-12-01 Thread Shawn Heisey
On 12/1/2011 8:01 AM, Antoine LE FLOC'H wrote: Is there any difference in the way things are stored in the filterCache if I do (fq=field1:val1 AND field2:val2) or fq=field1:val1&fq=field2:val2, even though these are logically identical? What gets stored exactly? Also, can you point me to where in

Re: Configuring the Distributed

2011-12-01 Thread Mark Miller
Right now let's say you have one shard - everything there hashes to range X. Now you want to split that shard with an Index Splitter. You divide range X in two - giving you two ranges - then you start splitting. This is where the current Splitter needs a little modification. You decide which
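The range-splitting idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 32-bit range, the CRC32 hash, and the names `split_range`, `doc_hash`, and `assign` are all hypothetical stand-ins and are not taken from the solrcloud branch or from the contrib index splitter.

```python
import zlib


def split_range(lo, hi):
    """Split the inclusive hash range [lo, hi] into two adjacent halves."""
    mid = lo + (hi - lo) // 2
    return (lo, mid), (mid + 1, hi)


def doc_hash(doc_id):
    # Stand-in hash (CRC32, unsigned 32-bit); an assumption for illustration,
    # not the hash actually used on the branch.
    return zlib.crc32(doc_id.encode("utf-8"))


def assign(doc_id, ranges):
    """Return the index of the range that the document's hash falls into."""
    h = doc_hash(doc_id)
    for index, (lo, hi) in enumerate(ranges):
        if lo <= h <= hi:
            return index
    raise ValueError("hash is outside all ranges")


# One shard owning the whole range X, then split into two ranges.
range_x = (0, 2**32 - 1)
left, right = split_range(*range_x)

# During the split, each document goes to whichever half covers its hash.
for doc_id in ["doc-1", "doc-2", "doc-3"]:
    print(doc_id, "->", assign(doc_id, [left, right]))
```

The point of the sketch is that no document needs rehashing: the hash stays the same, and only the range that owns it changes.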

Re: Configuring the Distributed

2011-12-01 Thread Jamie Johnson
Yes, the ZK method seems much more flexible. Adding a new shard would be simply updating the range assignments in ZK. Where is this currently on the list of things to accomplish? I don't have time to work on this now, but if you (or anyone) could provide direction I'd be willing to work on this

Re: Configuring the Distributed

2011-12-01 Thread Ted Dunning
Of course, resharding is almost never necessary if you use micro-shards. Micro-shards are shards small enough that you can fit 20 or more on a node. If you have that many on each node, then adding a new node consists of moving some shards to the new machine rather than moving lots of little
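As a rough sketch of the micro-shard approach described above, the following Python moves whole shards onto a newly added node until the load is roughly even. The function name `add_node`, the node and shard names, and the "take from the fullest node" policy are assumptions made for illustration; nothing here reflects an actual Solr API.

```python
def add_node(placement, new_node):
    """Move whole shards onto new_node until it holds its fair share.

    placement maps node name -> list of shard names. Returns a new placement
    and the list of (shard, from_node, to_node) moves; the input is not mutated.
    """
    placement = {node: list(shards) for node, shards in placement.items()}
    placement[new_node] = []
    total = sum(len(shards) for shards in placement.values())
    target = total // len(placement)
    moves = []
    while len(placement[new_node]) < target:
        # Take one shard from whichever existing node currently holds the most.
        donor = max(
            (node for node in placement if node != new_node),
            key=lambda node: len(placement[node]),
        )
        shard = placement[donor].pop()
        placement[new_node].append(shard)
        moves.append((shard, donor, new_node))
    return placement, moves


# Two nodes with 20 micro-shards each; a third node joins the cluster.
before = {
    "node1": ["shard%d" % i for i in range(20)],
    "node2": ["shard%d" % i for i in range(20, 40)],
}
after, moves = add_node(before, "node3")
print({node: len(shards) for node, shards in after.items()})
```

The number of shards never changes here; only their placement does, which is what keeps the hashing simple.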

Re: Configuring the Distributed

2011-12-01 Thread Mark Miller
In this case we are still talking about moving a whole index at a time rather than lots of little documents. You split the index into two, and then ship one of them off. The extra cost you can avoid with micro sharding will be the cost of splitting the index - which could be significant for a

Re: Configuring the Distributed

2011-12-01 Thread Mark Miller
Sorry - missed something - you also have the added cost of shipping the new half index to all of the replicas of the original shard with the splitting method. Unless you somehow split on every replica at the same time - then of course you wouldn't be able to avoid the 'busy' replica, and it

Re: Configuring the Distributed

2011-12-01 Thread Jamie Johnson
So I couldn't resist and attempted to do this tonight. I used the solrconfig you mentioned (as is, no modifications) and set up a 2 shard cluster in collection1. I sent 1 doc to 1 of the shards, updated it and sent the update to the other. I don't see the modifications though, I only see the original

Re: Configuring the Distributed

2011-12-01 Thread Mark Miller
It's not full of details yet, but there is a JIRA issue here: https://issues.apache.org/jira/browse/SOLR-2595 On Thu, Dec 1, 2011 at 8:51 PM, Jamie Johnson jej2...@gmail.com wrote: Yes, the ZK method seems much more flexible. Adding a new shard would be simply updating the range assignments

Re: Configuring the Distributed

2011-12-01 Thread Mark Miller
Hmm...sorry bout that - so my first guess is that right now we are not distributing a commit (easy to add, just have not done it). Right now I explicitly commit on each server for tests. Can you try explicitly committing on server1 after updating the doc on server 2? I can start distributing

Re: Configuring the Distributed

2011-12-01 Thread Jamie Johnson
Thanks for the quick response. With that change (have not done numShards yet) shard1 got updated. But now when executing the following queries I get information back from both, which doesn't seem right http://localhost:7574/solr/select/?q=*:* <doc><str name="key">1</str><str name="content_mvtxt">updated

Re: Configuring the Distributed

2011-12-01 Thread Mark Miller
Not sure offhand - but things will be funky if you don't specify the correct numShards. The instance to shard assignment should be using numShards to assign. But then the hash to shard mapping actually goes on the number of shards it finds registered in ZK (it doesn't have to, but really these

Re: Configuring the Distributed

2011-12-01 Thread Mark Miller
Getting late - didn't really pay attention to your code I guess - why are you adding the first doc without specifying the distrib update chain? This is not really supported. It's going to just go to the server you specified - even with everything set up right, the update might then go to that

Possible to facet across two indices, or document types in single index?

2011-12-01 Thread Jeff Schmidt
Hello: I'm trying to relate together two different types of documents. Currently I have 'node' documents that reside in one index (core), and 'product mapping' documents that are in another index. The product mapping index is used to map tenant products to nodes. The nodes are canonical

XPathEntityProcessor, Fields without Content, and Null-backup

2011-12-01 Thread Michael Watts
Hello Solr and Solr-Users, I can't confidently say I completely understand all that these classes so boldly tackle (that is, XPathEntityProcessor and XPathRecordReader), but there may be someone who does. Nonetheless, I think I've got some or most of this right, and more likely there are more

Re: Configuring the Distributed

2011-12-01 Thread Jamie Johnson
Really just trying to do a simple add and update test; the missing chain is just proof of my not understanding exactly how this is supposed to work. I modified the code to this: String key = "1"; SolrInputDocument solrDoc = new SolrInputDocument();

Re: Configuring the Distributed

2011-12-01 Thread Ted Dunning
Well, this goes both ways. It is not that unusual to take a node down for maintenance of some kind or even to have a node failure. In that case, it is very nice to have the load from the lost node be spread fairly evenly across the remaining cluster. Regarding the cost of having several

Re: Configuring the Distributed

2011-12-01 Thread Ted Dunning
With micro-shards, you can use random numbers for all placements with minor constraints like avoiding replicas sitting in the same rack. Since the number of shards never changes, things stay very simple. On Thu, Dec 1, 2011 at 6:44 PM, Mark Miller markrmil...@gmail.com wrote: Sorry - missed