Re: solr4.1 createNodeSet requires ip addresses?
Hi,

I created a ticket and tried to describe it there: https://issues.apache.org/jira/browse/SOLR-4471

Actually, search speed and RAM/memory usage on Solr 4.x compared with 3.6 look good; only the network is blocked by the slave doing a full copy of the index.

André

On 16.02.13 03:25, Mark Miller markrmil...@gmail.com wrote:

For 4.2, I'll try to put in https://issues.apache.org/jira/browse/SOLR-4078 soon.

Not sure about the behavior you're seeing - you might want to file a JIRA issue.

- Mark

On Feb 15, 2013, at 8:17 PM, Gary Yngve gary.yn...@gmail.com wrote:

Hi all,

I've been unable to get the collections create API to work with a createNodeSet containing hostnames, neither localhost nor external hostnames. I've only been able to get it working when using explicit IP addresses. It looks like zk stores the IP addresses in clusterstate.json and live_nodes. Is it possible that SolrCloud is not doing any hostname resolution but just looking for an explicit match with createNodeSet?

This is kind of annoying, in that I am working with EC2 instances and consider it pretty lame to need to use elastic IPs for internal use. I'm hacking around it now (looking up the eth0 inet addr on each machine), but I'm not happy about it. Has anyone else found a better solution?

The reason I want to specify explicit nodes for collections is so I can have just one zk ensemble managing collections across different environments that will go up and down independently of each other.

Thanks,
Gary
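[For reference, a minimal sketch of a collection-create call with createNodeSet; the collection name, shard count, and addresses are placeholders. The createNodeSet values must match the node names exactly as they appear under live_nodes in ZooKeeper, i.e. host:port_solr, which would explain the behavior Gary describes:

http://10.0.0.1:8983/solr/admin/collections?action=CREATE&name=mycollection&numShards=2&createNodeSet=10.0.0.1:8983_solr,10.0.0.2:8983_solr
]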
problem with full copy on replication solr4.1
Hi,

I upgraded Solr from 3.6 to 4.1. Since then, replication does a full copy of the index from the master. The master is updated by a delta-import via DIH every 10 min; the slave poll interval is 10 sec.

After debugging and searching I found the patch in SOLR-4413. The problem was that the slave checks the wrong directory (index/ instead of index.{timestamp}/). I created a patched version of Solr using SOLR-4413. Now it works for the first replication: I checked the logs on the slave and see it is skipping old files. The generation of master and slave is the same, only the index version differs.

On the next poll the slave forces a full copy from the master again, I think because of the different index version. The slave then downloads all files again and is finally in sync with the master (generation + version) until the next commit on the master.

Question: any ideas how I can avoid this behavior - why does the slave generate a new version of the index?

Thanks,
André
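[For context, a typical Solr 4.x slave setup in solrconfig.xml looks roughly like the sketch below; the masterUrl and core name are placeholders, and the pollInterval matches the 10 sec mentioned above:

{solrconfig.xml}
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <!-- placeholder master URL; pollInterval is hh:mm:ss -->
    <str name="masterUrl">http://master-host:8983/solr/mycore/replication</str>
    <str name="pollInterval">00:00:10</str>
  </lst>
</requestHandler>
{solrconfig.xml}
]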
RE: Ebay Kleinanzeigen and Auto Suggest
Hi,

yes we do. If you use a limited number of categories (like 100) you can use dynamic fields with the TermsComponent by choosing a category-specific prefix, like:

{schema.xml}
...
<dynamicField name="*_suggestion" type="textAS" indexed="true" stored="false" multiValued="true" omitNorms="true"/>
...
{schema.xml}

And within the data import handler we script the prefix from the given category:

{data-config.xml}
function setCatPrefixFields(row) {
    var catId = row.get('category');
    var title = row.get('freetext');
    var cat_prefix = 'c' + catId + '_suggestion';
    row.put(cat_prefix, title); // write the title into the category-specific field (this line was apparently lost in the archive)
    return row;
}
{data-config.xml}

We then adapt this in our application layer via a specific request handler that takes the prefix into account.

Pro:
- works fine for a limited number of categories
Con:
- the index gets bigger; we measured an increase of ~40 percent

Regards
André Charton

-----Original Message-----
From: Eric Grobler [mailto:impalah...@googlemail.com]
Sent: Wednesday, April 27, 2011 9:56 AM
To: solr-user@lucene.apache.org
Subject: Re: Ebay Kleinanzeigen and Auto Suggest

Hi Otis,

The new Solr 3.1 Suggester also does not support filter queries.
Is anyone using shingles with faceting on large data?

Regards
Ericz

On Tue, Apr 26, 2011 at 10:06 PM, Otis Gospodnetic otis_gospodne...@yahoo.com wrote:

Hi Eric,

Before using the terms component, allow me to point out:
* http://sematext.com/products/autocomplete/index.html (used on http://search-lucene.com/ for example)
* http://wiki.apache.org/solr/Suggester

Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/

----- Original Message -----
From: Eric Grobler impalah...@googlemail.com
To: solr-user@lucene.apache.org
Sent: Tue, April 26, 2011 1:11:11 PM
Subject: Ebay Kleinanzeigen and Auto Suggest

Hi

Someone told me that ebay is using Solr. I was looking at their auto-suggest implementation and I guess they are using shingles and the TermsComponent. I managed to get a satisfactory implementation, but I have a problem with category-specific filtering. Ebay suggestions are sensitive to categories like Cars and Pets. As far as I understand, it is not possible to use filters with a terms query. Unless one uses multiple fields or special prefixes for the indexed words, I cannot think how to implement this. Is there perhaps a workaround for this limitation?

Best Regards
EricZ

---

I have a shingle type like:

<fieldType name="shingle_text" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.ShingleFilterFactory" minShingleSize="2" maxShingleSize="4"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

and a query like

http://localhost:8983/solr/terms?q=*%3A*&terms.fl=suggest_text&terms.sort=count&terms.prefix=audi i
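[To illustrate the per-category lookup described above: assuming titles for category 100 were indexed into the dynamic field c100_suggestion (the category id and field name are hypothetical), the application layer would issue something like:

http://localhost:8983/solr/terms?terms.fl=c100_suggestion&terms.sort=count&terms.prefix=au&terms.limit=10

The category restriction is baked into the field name itself, which sidesteps the TermsComponent's lack of filter-query support.]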
RE: data-config.xml: delta-import unclear behaviour pre/postDeleteImportQuery with clean
Hi Manu,

from 1.4.1 it is invoked if postImportDeleteQuery is not null and clean is true, see the code:

...
String delQuery = e.allAttributes.get("preImportDeleteQuery");
if (dataImporter.getStatus() == DataImporter.Status.RUNNING_DELTA_DUMP) {
    cleanByQuery(delQuery, fullCleanDone);
    doDelta();
    delQuery = e.allAttributes.get("postImportDeleteQuery");
    if (delQuery != null) {
        fullCleanDone.set(false);
        cleanByQuery(delQuery, fullCleanDone);
    }
}
...

private void cleanByQuery(String delQuery, AtomicBoolean completeCleanDone) {
    delQuery = getVariableResolver().replaceTokens(delQuery);
    if (requestParameters.clean) {
        if (delQuery == null && !completeCleanDone.get()) {
            writer.doDeleteAll();
            completeCleanDone.set(true);
        } else if (delQuery != null) {
            writer.deleteByQuery(delQuery);
        }
    }
}

André

-----Original Message-----
From: manuel aldana [mailto:ald...@gmx.de]
Sent: Montag, 31. Januar 2011 09:40
To: solr-user@lucene.apache.org
Subject: data-config.xml: delta-import unclear behaviour pre/postDeleteImportQuery with clean

I see some unclear behaviour when using clean with pre/postImportDeleteQuery for delta-imports. The docs under http://wiki.apache.org/solr/DataImportHandler#Configuration_in_data-config.xml are not clear enough.

My observation is:
- preImportDeleteQuery is only executed if clean=true is set
- postImportDeleteQuery is only executed if clean=true is set
- if preImportDeleteQuery is omitted and clean=true, then the whole index is cleaned => a config with postImportDeleteQuery alone won't work

Is the above correct? I don't need preImportDeleteQuery, only post is necessary. But to make post work I am doubling the post query into pre, so clean=true doesn't delete the whole index. This looks a bit like a workaround rather than the wanted behaviour.

Solr version is 1.4.1.

thanks.

--
manuel aldana
mail: ald...@gmx.de | man...@aldana-online.de
blog: www.aldana-online.de
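[A sketch of the workaround manuel describes, in data-config.xml - the entity, SQL, and deleted:true query are hypothetical; the point is that duplicating the delete query into preImportDeleteQuery prevents clean=true from falling back to a full doDeleteAll():

{data-config.xml}
<entity name="item" pk="id"
        query="SELECT * FROM item"
        deltaQuery="SELECT id FROM item WHERE updated &gt; '${dataimporter.last_index_time}'"
        deltaImportQuery="SELECT * FROM item WHERE id='${dataimporter.delta.id}'"
        preImportDeleteQuery="deleted:true"
        postImportDeleteQuery="deleted:true">
  ...
</entity>
{data-config.xml}
]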
Default filter in solr config (+filter document by now for near time index feeling)
Hi,

I have this use case: I update the index every 10 min on a master Solr (via batch) and replicate it to the slaves. The clients use the slaves. From the client's view this is ugly: it looks like we change our index only every 10 minutes.

The idea now is to index all documents with an index date, set this index date 10 min into the future, and add a filter INDEX_DATE:[* TO NOW].

Question 1: is it possible to set this as part of the solr config, so every implementation against the server will regard it?

Question 2: from a caching point of view this sounds a little ugly - is it? Has anybody tried this?

Thanks,
André
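[Regarding Question 1, a minimal sketch of one way to do this, assuming a SearchHandler named /select and the INDEX_DATE field above: request handlers in solrconfig.xml can append a filter query to every incoming request. Rounding NOW down (e.g. NOW/MINUTE) also speaks to Question 2, since the filter stays identical - and therefore filter-cacheable - for a minute at a time instead of changing on every request:

{solrconfig.xml}
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="appends">
    <!-- hide future-dated documents; NOW/MINUTE keeps the filter cache-friendly -->
    <str name="fq">INDEX_DATE:[* TO NOW/MINUTE]</str>
  </lst>
</requestHandler>
{solrconfig.xml}
]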