Re: spell suggestions help
hi jack I am using the whitespace tokenizer only, and before it I'm using a pattern replace char filter to replace & with "and", but it's not working I guess. My query analyzer:

<analyzer type="query">
  <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="&amp;" replacement="and"/>
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
  <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt" enablePositionIncrements="true"/>
</analyzer>

On Thu, Apr 11, 2013 at 6:03 PM, Jack Krupansky j...@basetechnology.com wrote: Try replacing the standard tokenizer with the whitespace tokenizer in your field types. And make sure not to use any other token filters that might discard special characters (or provide a character map if they support one). Also, be sure to try your test terms on the Solr Admin UI Analysis page to see whether the & is preserved, or at which stage of term analysis it gets discarded. -- Jack Krupansky -Original Message- From: Rohan Thakur Sent: Thursday, April 11, 2013 7:39 AM To: solr-user@lucene.apache.org Subject: Re: spell suggestions help urlencode replaces & with a space, thus resulting in results that contain even the single terms: in the case of "mobile & accessories" it replaces it with "mobile accessories" and returns documents containing even just "accessories", which I don't want. How to tackle this? I tried using a pattern replace filter at query time to replace & with "and", but it did not work; I used pattern "&amp;" replaced with "and" in this, but it did not work. Any guess or help.. thanks regards rohan On Thu, Apr 11, 2013 at 4:39 PM, Rohan Thakur rohan.i...@gmail.com wrote: hi erick do we have to do urlencoding from the php side or does solr support urlencode?
On Thu, Apr 11, 2013 at 5:57 AM, Erick Erickson erickerick...@gmail.com wrote: Try URL encoding it and/or escaping the & On Tue, Apr 9, 2013 at 2:32 AM, Rohan Thakur rohan.i...@gmail.com wrote: hi all, one thing I wanted to clear up: for every other query I get correct suggestions, but in these 2 cases I am not getting what are supposed to be the suggestions: 1) I have kettle (doc frequency = 5) and cable (doc frequency = 1) indexed in the direct Solr spell checker, but when I query for cattle I get cable as the only suggestion and not kettle. Why is this happening? I want to get kettle as a suggestion as well. I'm using JaroWinkler distance, according to which the score for cattle => cable comes out to be 0.857 and for cattle => kettle comes out to be 0.777; kettle should also come in the suggestions but it's not. How can I correct this, anyone? 2) how to query for a sentence like hand blandar & chopper, as & is a delimiter for a Solr query and thus this query is returning an error. thanks in advance regards Rohan
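For what it's worth, the two scores Rohan quotes line up with a textbook Jaro-Winkler computation. The sketch below is a generic implementation, not Lucene's JaroWinklerDistance class, so minor differences in edge cases are possible:

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro similarity: match window = max(len)/2 - 1, transpositions halved."""
    if s1 == s2:
        return 1.0
    window = max(len(s1), len(s2)) // 2 - 1
    m1, m2 = [False] * len(s1), [False] * len(s2)
    matches = 0
    for i, c in enumerate(s1):
        for j in range(max(0, i - window), min(len(s2), i + window + 1)):
            if not m2[j] and s2[j] == c:
                m1[i] = m2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count transpositions: matched characters that are out of order.
    t, k = 0, 0
    for i in range(len(s1)):
        if m1[i]:
            while not m2[k]:
                k += 1
            if s1[i] != s2[k]:
                t += 1
            k += 1
    t //= 2
    return (matches / len(s1) + matches / len(s2) + (matches - t) / matches) / 3

def jaro_winkler(s1: str, s2: str, p: float = 0.1) -> float:
    """Boost the Jaro score by a shared prefix of up to 4 characters."""
    j = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == 4:
            break
        prefix += 1
    return j + prefix * p * (1 - j)

print(round(jaro_winkler("cattle", "cable"), 3))   # ~0.858 (Rohan's 0.857)
print(round(jaro_winkler("cattle", "kettle"), 3))  # ~0.778 (Rohan's 0.777)
```

This confirms the scoring itself is behaving as documented; cable really is closer to cattle than kettle is, largely because of the Winkler shared-prefix bonus ("ca").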
XInclude in data-config.xml
hello. is it possible to include some entities with XInclude in my data-config.xml? i tried with this line:

<xi:include href="solr/entity.xml" xmlns:xi="http://www.w3.org/2001/XInclude"/>

in my entity.xml is something like:

<entity name="name" query="SELECT * FROM table"></entity>

any ideas why it does not work? this blog sounds good to me =( http://www.raspberry.nl/2010/10/30/solr-xml-config-includes/ -- View this message in context: http://lucene.472066.n3.nabble.com/XInclude-in-data-config-xml-tp4055487.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Slow qTime for distributed search
Manuel Le Normand, I am sorry but I want to learn something. You said you have 40 dedicated servers. What is your total document count, total document size, and total shard size? 2013/4/11 Manuel Le Normand manuel.lenorm...@gmail.com Hi, We have different working hours, sorry for the reply delay. Your assumed numbers are right: about 25-30KB per doc, giving a total of 15GB per shard; there are two shards per server (+2 slaves that should do no work normally). An average query has about 30 conditions (OR AND mixed), most of them textual, a small part on dateTime. They use only simple queries (no facets, filters etc.) as it is taken from the actual query set of my enterprise, which works with an old search engine. As we said, if the shards in collection1 and collection2 have the same number of docs each (and the same RAM and CPU per shard), it is apparently not a slow IO issue, right? So the fact of not having cached all my index doesn't seem to be the bottleneck. Moreover, I do store the fields, but my query set requests only the ids and rarely snippets, so I'd assume that the plenty of RAM I'd give the OS wouldn't make any difference, as these *.fdt files don't need to get cached. The conclusion I get to is that the merging issue is the problem, and the only possibility of outsmarting it is to distribute to much fewer shards, meaning that I'll get back to a few million docs per shard, which is about linearly slower with the number of docs per shard. Though the latter should improve if I give much more RAM per server. I'll try tweaking my schema a bit and making better use of the Solr caches (filter query as an example), but I have something telling me the problem might be elsewhere.
My main clue to it is that merging seems like a simple CPU task, and tests show that even with a small number of responses it takes a long time (and clearly the merging task on a few docs is very short). On Wed, Apr 10, 2013 at 2:50 AM, Shawn Heisey s...@elyograg.org wrote: On 4/9/2013 3:50 PM, Furkan KAMACI wrote: Hi Shawn; You say that: *... your documents are about 50KB each. That would translate to an index that's at least 25GB* I know we can not say an exact size, but what is the approximate ratio of document size / index size according to your experience? If you store the fields, that is the actual size plus a small amount of overhead. Starting with Solr 4.1, stored fields are compressed. I believe it uses LZ4 compression. Some people store all fields, some people store only a few or one - an ID field. The size of stored fields does have an impact on how much OS disk cache you need, but not as much as the other parts of an index. It's been my experience that termvectors take up almost as much space as stored data for the same fields, and sometimes more. Starting with Solr 4.2, termvectors are also compressed. Adding docValues (new in 4.2) to the schema will also make the index larger. The requirements here are similar to stored fields. I do not know whether this data gets compressed, but I don't think it does. As for the indexed data, this is where I am less clear about the storage ratios, but I think you can count on it needing almost as much space as the original data. If the schema uses types or filters that produce a lot of information, the indexed data might be larger than the original input. Examples of data explosions in a schema: trie fields with a non-zero precisionStep, the edgengram filter, the shingle filter. Thanks, Shawn
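A back-of-the-envelope check of the numbers in this thread; every input below is one of the thread's stated estimates (25-30KB/doc, 15GB/shard, 40 servers with two shards each), not a measurement, and the server/shard layout is a rough reading that ignores the slaves:

```python
# All figures are the thread's own estimates, not measurements.
avg_doc_kb = 27.5            # midpoint of the 25-30 KB/doc estimate
shard_kb = 15 * 1024 * 1024  # 15 GB per shard, expressed in KB
docs_per_shard = shard_kb / avg_doc_kb

servers, shards_per_server = 40, 2   # layout as described (slaves excluded)
total_docs = docs_per_shard * servers * shards_per_server
print(f"{docs_per_shard:,.0f} docs/shard, roughly {total_docs / 1e6:.0f}M docs total")
```

So each shard would hold on the order of 550-600K documents, which answers the document-count question at the top of the thread to within the precision of the inputs.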
RE: SolrCloud leader to replica
Hi Otis and Timothy, Thanks very much for your help; sure, I will test to make sure. What I mentioned before is a mere possibility; likely you are correct: the small delay may not matter in reality (yes, we do use the same way to do pagination and no issue ever happened even once). Surely Solr is enormously valuable to us and we really appreciate your help! Lisheng -Original Message- From: Otis Gospodnetic [mailto:otis.gospodne...@gmail.com] Sent: Thursday, April 11, 2013 5:27 PM To: solr-user@lucene.apache.org Subject: Re: SolrCloud leader to replica Hi, I think Timothy is right about what Lisheng is really after, which is consistency. I agree with what Timothy is implying here - the chances of search being inconsistent are very, very small. I'm guessing Lisheng is trying to solve a problem he doesn't actually have yet? Also, think about a non-SolrCloud solution. What happens when a user pages through results? Typically that just re-runs the same query, but with a different page offset. What happens if between page 1 and page 2 the index changes and a searcher is reopened? The same sort of problem can happen, right? Yet, in a few hundred client engagements involving Solr or ElasticSearch, I don't recall this ever being an issue. Otis -- Solr & ElasticSearch Support http://sematext.com/ On Thu, Apr 11, 2013 at 8:13 PM, Timothy Potter thelabd...@gmail.com wrote: Hmmm ... I was following this discussion but then got confused when Lisheng said to change Solr to compromise consistency in order to increase availability, when your concern is how long the replica is behind the leader. Seems you want more consistency vs. less in this case? One of the reasons behind Solr's leader election approach is to achieve low-latency eventual consistency (Mark's term from the linked-to discussion). Un-committed docs are only visible if you use real-time get, in which case the request is served by the shard leader (or replica) from its update log.
I suppose there's a chance of a few millis between the leader having the request in its tlog and the replica having the doc in its tlog, but that seems like the nature of the beast. Meaning that Solr never promised to be 100% consistent at millisecond granularity in a distributed model - any small time-window between what a leader has and what a replica has is probably network latency, which you should solve outside of Solr. I suspect you could direct all your real-time get requests to leaders only, using some smart client like CloudSolrServer, if it mattered that much. Otherwise, all other queries require the document to be committed to be visible. I suppose there is a very small window when a new searcher is open on the leader and the new searcher is not yet open on the replica. However, with soft commits, that too seems like a milli or two based on network latency. @Shawn - yes, I've actually seen this work in my cluster. We lose replicas from time to time and indexing keeps on trucking. On Thu, Apr 11, 2013 at 4:51 PM, Zhang, Lisheng lisheng.zh...@broadvision.com wrote: Hi Otis, Thanks very much for your help; your explanation is very clear. My main concern is not the return status for indexing calls (although that is also important); my main concern is how long the replica is behind the leader (or, putting it your way, how consistent the search picture is to client A and B). Our application requires that clients see the same result whether they hit the leader or a replica, so it seems we do have a problem here. If there is no better solution, I may consider changing Solr 4 a little (I have not read the 4.x code fully yet) to compromise consistency (C) in order to increase availability (A); on a high level, do you see serious problems with this approach (I am familiar with the Lucene/Solr code to some extent)?
Thanks and best regards, Lisheng -Original Message- From: Otis Gospodnetic [mailto:otis.gospodne...@gmail.com] Sent: Thursday, April 11, 2013 2:50 PM To: solr-user@lucene.apache.org Subject: Re: SolrCloud leader to replica But note that I misspoke, which I realized after re-reading the thread I pointed you to. Mark explains it nicely there: * the index call returns only when (and IF!) indexing to all replicas succeeds BUT, that should not be mixed up with what search clients see! Just because the indexing client sees the all-or-nothing situation depending on whether indexing was successful on all replicas does NOT mean that search clients will always see a 100% consistent picture. Client A could hit the leader and see a newly indexed document, while client B could query the replica and not see that same document, simply because the doc hasn't gotten there yet, or because a soft commit hasn't happened just yet. Otis -- Solr & ElasticSearch Support http://sematext.com/ On Thu, Apr 11, 2013 at 4:39 PM, Zhang, Lisheng lisheng.zh...@broadvision.com wrote: Thanks very much for your help! -Original Message- From: Otis Gospodnetic
Re: XInclude in data-config.xml
Are you sure your original problem is not fixable with resolvable properties and variable substitutions ${varname}? Because Solr has good support for that. Otherwise, check that you have the right relative file path. I am not sure what the XML processor thinks it is. Use truss/strace on Unix/Mac and Process Monitor on Windows. It is often faster to check that than to try to guess. Regards, Alex Personal blog: http://blog.outerthoughts.com/ LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch - Time is the quality of nature that keeps events from happening all at once. Lately, it doesn't seem to be working. (Anonymous - via GTD book) On Fri, Apr 12, 2013 at 3:31 AM, stockii stock.jo...@googlemail.com wrote: hello. is it possible to include some entities with XInclude in my data-config.xml? i tried with this line:

<xi:include href="solr/entity.xml" xmlns:xi="http://www.w3.org/2001/XInclude"/>

in my entity.xml is something like:

<entity name="name" query="SELECT * FROM table"></entity>

any ideas why it does not work? this blog sounds good to me =( http://www.raspberry.nl/2010/10/30/solr-xml-config-includes/ -- View this message in context: http://lucene.472066.n3.nabble.com/XInclude-in-data-config-xml-tp4055487.html Sent from the Solr - User mailing list archive at Nabble.com.
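Alex's variable-substitution suggestion could look roughly like this in a DIH config. This is only a sketch: the `db.url`/`db.user`/`db.password` property names are hypothetical, and would need to be defined somewhere Solr resolves properties from (e.g. solrcore.properties or JVM system properties):

```xml
<!-- data-config.xml: ${...} placeholders are resolved by Solr's property
     substitution; the db.* property names here are made up for illustration. -->
<dataConfig>
  <dataSource driver="com.mysql.jdbc.Driver"
              url="${db.url}" user="${db.user}" password="${db.password}"/>
  <document>
    <entity name="name" query="SELECT * FROM table"/>
  </document>
</dataConfig>
```

This keeps environment-specific values out of the config file, which covers many of the use cases people reach for XInclude for.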
how to migrate solr 1.4 index to solr 4.2 index
Hi, can anybody help with my question below, please? How can I migrate a Solr 1.4 index to a Solr 4.2 index? I have done the following, but it does not work completely: I migrated the 1.4 index to a 3.5 index, and that was done successfully. But now, when I try to migrate the 3.5 index to 4.2, it does not migrate successfully and gives me the error below.

INFO: [] webapp=/solr35 path=/replication params={file=_zp.nrm&command=filecontent&checksum=true&generation=1190&qt=/replication&wt=filestream} status=0 QTime=0
Apr 12, 2013 4:50:03 PM org.apache.solr.core.SolrCore execute
INFO: [] webapp=/solr35 path=/replication params={file=_yj.frq&command=filecontent&checksum=true&generation=1190&qt=/replication&wt=filestream} status=0 QTime=0
Apr 12, 2013 4:50:03 PM org.apache.solr.core.SolrCore execute
INFO: [] webapp=/solr35 path=/replication params={file=_zp.tis&command=filecontent&checksum=true&generation=1190&qt=/replication&wt=filestream} status=0 QTime=0
Apr 12, 2013 4:50:03 PM org.apache.solr.core.SolrCore execute
INFO: [] webapp=/solr35 path=/replication params={file=_yj_8.del&command=filecontent&checksum=true&generation=1190&qt=/replication&wt=filestream} status=0 QTime=0
Apr 12, 2013 4:50:03 PM org.apache.solr.handler.SnapPuller fetchLatestIndex
INFO: Total time taken for download : 3 secs
Apr 12, 2013 4:50:04 PM org.apache.solr.update.DefaultSolrCoreState newIndexWriter
INFO: Creating new IndexWriter...
Apr 12, 2013 4:50:04 PM org.apache.solr.update.DefaultSolrCoreState newIndexWriter
INFO: Waiting until IndexWriter is unused... core=collection1
Apr 12, 2013 4:50:04 PM org.apache.solr.core.CachingDirectoryFactory closeCacheValue
INFO: looking to close D:\solr421\data\index.2013041216531 [CachedDir<<refCount=0;path=D:\solr421\data\index.2013041216531;done=true>>]
Apr 12, 2013 4:50:04 PM org.apache.solr.core.CachingDirectoryFactory close
INFO: Closing directory: D:\solr421\data\index.2013041216531
Apr 12, 2013 4:50:04 PM org.apache.solr.core.CachingDirectoryFactory closeCacheValue
INFO: Removing directory before core close: D:\solr421\data\index.2013041216531
Apr 12, 2013 4:50:04 PM org.apache.solr.common.SolrException log
SEVERE: SnapPull failed :org.apache.solr.common.SolrException: Index fetch failed :
	at org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:459)
	at org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:281)
	at org.apache.solr.handler.SnapPuller$1.run(SnapPuller.java:222)
	at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
	at java.util.concurrent.FutureTask$Sync.innerRunAndReset(Unknown Source)
	at java.util.concurrent.FutureTask.runAndReset(Unknown Source)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(Unknown Source)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.lang.Thread.run(Unknown Source)
Caused by: org.apache.lucene.index.IndexFormatTooOldException: Format version is not supported (resource: SimpleFSIndexInput(path="D:\solr421\data\index\_yj.fdx")): 1 (needs to be between 2 and 3). This version of Lucene only supports indexes created with release 3.0 and later.
	at org.apache.lucene.codecs.lucene3x.Lucene3xStoredFieldsReader.checkCodeVersion(Lucene3xStoredFieldsReader.java:119)
	at org.apache.lucene.codecs.lucene3x.Lucene3xSegmentInfoReader.readLegacyInfos(Lucene3xSegmentInfoReader.java:74)
	at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:312)
	at org.apache.lucene.index.SegmentInfos$1.doBody(SegmentInfos.java:347)
	at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:783)
	at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:630)
	at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:343)
	at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:673)
	at org.apache.solr.update.SolrIndexWriter.<init>(SolrIndexWriter.java:77)
	at org.apache.solr.update.SolrIndexWriter.create(SolrIndexWriter.java:64)
	at org.apache.solr.update.DefaultSolrCoreState.createMainIndexWriter(DefaultSolrCoreState.java:198)
	at org.apache.solr.update.DefaultSolrCoreState.newIndexWriter(DefaultSolrCoreState.java:180)
	at org.apache.solr.update.DirectUpdateHandler2.newIndexWriter(DirectUpdateHandler2.java:615)
	at org.apache.solr.handler.SnapPuller.openNewWriterAndSearcher(SnapPuller.java:622)
	at org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:446)
	... 10 more
Apr 12, 2013 4:50:23 PM org.apache.solr.core.SolrCore execute
INFO: [] webapp=/solr35 path=/replication params={slave=false&command=details&qt=/replication&wt=javabin&version=2} status=0 QTime=0
Apr 12,
Re: XInclude in data-config.xml
Is your data-config.xml file located in your Solr conf directory? You have solr/ at the front of your path, so is the included file really in conf/solr? Otherwise, this should work. Make sure you have only a single XML element in the included file. -- Jack Krupansky -Original Message- From: stockii Sent: Friday, April 12, 2013 3:31 AM To: solr-user@lucene.apache.org Subject: XInclude in data-config.xml hello. is it possible to include some entities with XInclude in my data-config.xml? i tried with this line:

<xi:include href="solr/entity.xml" xmlns:xi="http://www.w3.org/2001/XInclude"/>

in my entity.xml is something like:

<entity name="name" query="SELECT * FROM table"></entity>

any ideas why it does not work? this blog sounds good to me =( http://www.raspberry.nl/2010/10/30/solr-xml-config-includes/ -- View this message in context: http://lucene.472066.n3.nabble.com/XInclude-in-data-config-xml-tp4055487.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: spell suggestions help
Be sure to use the Solr Admin UI Analysis page to verify what is happening at each stage of analysis. For BOTH index and query. You only showed us your query analyzer... show us the index analyzer as well. Did you make sure to delete the index data and completely reindex after changing the index analyzer? Or maybe your index and query analyzers are not in sync and compatible. Do you have anything in your stopwords file? "and" is usually considered a stop word - so the stop filter would remove it. -- Jack Krupansky -Original Message- From: Rohan Thakur Sent: Friday, April 12, 2013 2:12 AM To: solr-user@lucene.apache.org Subject: Re: spell suggestions help hi jack I am using the whitespace tokenizer only, and before it I'm using a pattern replace char filter to replace & with "and", but it's not working I guess. My query analyzer:

<analyzer type="query">
  <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="&amp;" replacement="and"/>
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
  <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt" enablePositionIncrements="true"/>
</analyzer>

On Thu, Apr 11, 2013 at 6:03 PM, Jack Krupansky j...@basetechnology.com wrote: Try replacing the standard tokenizer with the whitespace tokenizer in your field types. And make sure not to use any other token filters that might discard special characters (or provide a character map if they support one). Also, be sure to try your test terms on the Solr Admin UI Analysis page to see whether the & is preserved, or at which stage of term analysis it gets discarded.
-- Jack Krupansky -Original Message- From: Rohan Thakur Sent: Thursday, April 11, 2013 7:39 AM To: solr-user@lucene.apache.org Subject: Re: spell suggestions help urlencode replaces & with a space, thus resulting in results that contain even the single terms: in the case of "mobile & accessories" it replaces it with "mobile accessories" and returns documents containing even just "accessories", which I don't want. How to tackle this? I tried using a pattern replace filter at query time to replace & with "and", but it did not work; I used pattern "&amp;" replaced with "and" in this, but it did not work. Any guess or help.. thanks regards rohan On Thu, Apr 11, 2013 at 4:39 PM, Rohan Thakur rohan.i...@gmail.com wrote: hi erick do we have to do urlencoding from the php side or does solr support urlencode? On Thu, Apr 11, 2013 at 5:57 AM, Erick Erickson erickerick...@gmail.com wrote: Try URL encoding it and/or escaping the & On Tue, Apr 9, 2013 at 2:32 AM, Rohan Thakur rohan.i...@gmail.com wrote: hi all, one thing I wanted to clear up: for every other query I get correct suggestions, but in these 2 cases I am not getting what are supposed to be the suggestions: 1) I have kettle (doc frequency = 5) and cable (doc frequency = 1) indexed in the direct Solr spell checker, but when I query for cattle I get cable as the only suggestion and not kettle. Why is this happening? I want to get kettle as a suggestion as well. I'm using JaroWinkler distance, according to which the score for cattle => cable comes out to be 0.857 and for cattle => kettle comes out to be 0.777; kettle should also come in the suggestions but it's not. How can I correct this, anyone? 2) how to query for a sentence like hand blandar & chopper, as & is a delimiter for a Solr query and thus this query is returning an error. thanks in advance regards Rohan
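One common pitfall with this char filter: in schema.xml, a literal `&` inside an attribute value must itself be written as the XML entity `&amp;`, and the pattern is a regular expression. A sketch of a pattern that would normalize both a raw `&` and an already-escaped `&amp;` in the input text (the `(amp;)?` alternation is my illustration, not from the thread):

```xml
<!-- After XML unescaping, the regex is "&(amp;)?", so both "&" and "&amp;"
     in the incoming text are rewritten to "and" before tokenization. -->
<charFilter class="solr.PatternReplaceCharFilterFactory"
            pattern="&amp;(amp;)?" replacement="and"/>
```

If the filter still appears to do nothing, the Analysis page will show whether the `&` ever reaches the char filter, or is lost earlier (e.g. stripped by URL parameter handling before Solr sees it).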
Re: XInclude in data-config.xml
On 04/12/2013 09:31 AM, stockii wrote: hello. is it possible to include some entities with XInclude in my data-config.xml? We first struggled with XInclude, and then switched to using custom entities, which worked much better for our needs (reusing common parts in several SearchHandlers). Ex. in solrconfig.xml:

<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE config [
  <!ENTITY solrconfigcommon SYSTEM "solrconfig_common.xml">
]>
<config>
  ...
  <requestHandler name="search" class="solr.SearchHandler" default="true">
    <lst name="defaults">
      &solrconfigcommon;
    </lst>
  </requestHandler>
  ...
</config>

in solrconfig_common.xml:

<!-- XML fragment used as entity in solrconfig.xml -->
<str name="echoParams">explicit</str>
<str name="defType">edismax</str>
<str name="qf">title^4 description^1</str>
<str name="q">*:*</str>
<str name="q.alt">*:*</str>
<str name="rows">20</str>
<str name="q.op">AND</str>
<str name="pf">title~2^2.0</str>

HTH André Kelkoo SAS Société par Actions Simplifiée Au capital de € 4.168.964,30 Siège social : 8, rue du Sentier 75002 Paris 425 093 069 RCS Paris [This message and its attachments are confidential and intended solely for their addressees. If you are not the intended recipient, please delete it and notify the sender.]
Re: XInclude in data-config.xml
Hi Andre, In version 3.6.1, when we used entities in schema.xml for language analyzers, it gave errors on server restart and the core would not load. Regards, Sujatha 2013/4/12 Andre Bois-Crettez andre.b...@kelkoo.com On 04/12/2013 09:31 AM, stockii wrote: hello. is it possible to include some entities with XInclude in my data-config.xml? We first struggled with XInclude, and then switched to using custom entities, which worked much better for our needs (reusing common parts in several SearchHandlers). Ex. in solrconfig.xml:

<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE config [
  <!ENTITY solrconfigcommon SYSTEM "solrconfig_common.xml">
]>
<config>
  ...
  <requestHandler name="search" class="solr.SearchHandler" default="true">
    <lst name="defaults">
      &solrconfigcommon;
    </lst>
  </requestHandler>
  ...
</config>

in solrconfig_common.xml:

<!-- XML fragment used as entity in solrconfig.xml -->
<str name="echoParams">explicit</str>
<str name="defType">edismax</str>
<str name="qf">title^4 description^1</str>
<str name="q">*:*</str>
<str name="q.alt">*:*</str>
<str name="rows">20</str>
<str name="q.op">AND</str>
<str name="pf">title~2^2.0</str>

HTH André
SolrCloud vs Solr master-slave replication
Hi, I posted an issue with our Solr index earlier this week: http://lucene.472066.n3.nabble.com/corrupted-index-in-slave-td4054769.html. Today, that error started to happen constantly for almost every request, and I created a JIRA issue because I thought it was a bug: https://issues.apache.org/jira/browse/SOLR-4707 As you can read there, in the end it was due to a failure in the Solr master-slave replication, and now I don't know if we should think about migrating to SolrCloud, since Solr master-slave replication does not seem to fit our requirements: * index size: ~20 million documents, ~9GB * ~1200 updates/min * ~1 queries/min (distributed over 2 slaves) * handlers used: MoreLikeThis, RealTimeGet, TermVectorComponent, SearchHandler I would thank you if anyone could help me answer these questions: * Would it be advisable to migrate to SolrCloud? Would it have an impact on replication performance? * In that case, which would perform better: maintaining a copy of the index on every server, or using shard servers? * How many shards and replicas would you advise for ensuring high availability? Kind Regards, Victor -- View this message in context: http://lucene.472066.n3.nabble.com/SolrCloud-vs-Solr-master-slave-replication-tp4055541.html Sent from the Solr - User mailing list archive at Nabble.com.
solr spell correction help
hi all, I have configured Solr direct spell correction on a spell field. For most words Solr is correcting and giving suggestions, but some words, like those mentioned below, give absurd results: 1) blender (indexed) 2) kettle (indexed) 3) electric (indexed). Problems: 1) when I search for blandar it gives the correct result blender, but when I search for blandars it gives no correction to blender. 2) when I search for kettle, the correct spelling, it still shows it as false but gives no suggestions, even though the result documents show up; and when I search for cettle it gives the correct result kettle, but when I search for cattle it gives no suggestions. 3) again, when I search for electric, the correct spelling, it shows it as false in the suggestions section but gives no suggestions, and documents also return for this spelling as it is the correct one. Also, what if I want Solr to return samsung as a spell suggestion when I search for sam - what would the configuration be, and what could be the solution for the above problems? please help. thanks in advance regards Rohan
Re: how to migrate solr 1.4 index to solr 4.2 index
Try optimising your index in 3.5 before migrating to 4.2, as this should upgrade all segments to the 3.x format. Note, however, that you are likely to find issues using an index from 1.4 in a 4.x system. You will have to maintain the old field definitions using the old components, which will likely render some features non-functioning. For example, I have an index in that situation: my date fields were of type DateField, not TrieDateField, meaning I could not use them in boost functions. If you can, try to think of and plan a way to re-index your content. Upayavira On Fri, Apr 12, 2013, at 12:24 PM, Montu v Boda wrote: Hi, can anybody help with my question below, please? How can I migrate a Solr 1.4 index to a Solr 4.2 index? I have done the following, but it does not work completely: I migrated the 1.4 index to a 3.5 index, and that was done successfully. But now, when I try to migrate the 3.5 index to 4.2, it does not migrate successfully and gives me the error below.

INFO: [] webapp=/solr35 path=/replication params={file=_zp.nrm&command=filecontent&checksum=true&generation=1190&qt=/replication&wt=filestream} status=0 QTime=0
Apr 12, 2013 4:50:03 PM org.apache.solr.core.SolrCore execute
INFO: [] webapp=/solr35 path=/replication params={file=_yj.frq&command=filecontent&checksum=true&generation=1190&qt=/replication&wt=filestream} status=0 QTime=0
Apr 12, 2013 4:50:03 PM org.apache.solr.core.SolrCore execute
INFO: [] webapp=/solr35 path=/replication params={file=_zp.tis&command=filecontent&checksum=true&generation=1190&qt=/replication&wt=filestream} status=0 QTime=0
Apr 12, 2013 4:50:03 PM org.apache.solr.core.SolrCore execute
INFO: [] webapp=/solr35 path=/replication params={file=_yj_8.del&command=filecontent&checksum=true&generation=1190&qt=/replication&wt=filestream} status=0 QTime=0
Apr 12, 2013 4:50:03 PM org.apache.solr.handler.SnapPuller fetchLatestIndex
INFO: Total time taken for download : 3 secs
Apr 12, 2013 4:50:04 PM org.apache.solr.update.DefaultSolrCoreState newIndexWriter
INFO: Creating new IndexWriter...
Apr 12, 2013 4:50:04 PM org.apache.solr.update.DefaultSolrCoreState newIndexWriter
INFO: Waiting until IndexWriter is unused... core=collection1
Apr 12, 2013 4:50:04 PM org.apache.solr.core.CachingDirectoryFactory closeCacheValue
INFO: looking to close D:\solr421\data\index.2013041216531 [CachedDir<<refCount=0;path=D:\solr421\data\index.2013041216531;done=true>>]
Apr 12, 2013 4:50:04 PM org.apache.solr.core.CachingDirectoryFactory close
INFO: Closing directory: D:\solr421\data\index.2013041216531
Apr 12, 2013 4:50:04 PM org.apache.solr.core.CachingDirectoryFactory closeCacheValue
INFO: Removing directory before core close: D:\solr421\data\index.2013041216531
Apr 12, 2013 4:50:04 PM org.apache.solr.common.SolrException log
SEVERE: SnapPull failed :org.apache.solr.common.SolrException: Index fetch failed :
	at org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:459)
	at org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:281)
	at org.apache.solr.handler.SnapPuller$1.run(SnapPuller.java:222)
	at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
	at java.util.concurrent.FutureTask$Sync.innerRunAndReset(Unknown Source)
	at java.util.concurrent.FutureTask.runAndReset(Unknown Source)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(Unknown Source)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.lang.Thread.run(Unknown Source)
Caused by: org.apache.lucene.index.IndexFormatTooOldException: Format version is not supported (resource: SimpleFSIndexInput(path="D:\solr421\data\index\_yj.fdx")): 1 (needs to be between 2 and 3). This version of Lucene only supports indexes created with release 3.0 and later.
	at org.apache.lucene.codecs.lucene3x.Lucene3xStoredFieldsReader.checkCodeVersion(Lucene3xStoredFieldsReader.java:119)
	at org.apache.lucene.codecs.lucene3x.Lucene3xSegmentInfoReader.readLegacyInfos(Lucene3xSegmentInfoReader.java:74)
	at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:312)
	at org.apache.lucene.index.SegmentInfos$1.doBody(SegmentInfos.java:347)
	at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:783)
	at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:630)
	at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:343)
	at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:673)
	at org.apache.solr.update.SolrIndexWriter.<init>(SolrIndexWriter.java:77)
	at org.apache.solr.update.SolrIndexWriter.create(SolrIndexWriter.java:64)
	at
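Upayavira's "optimise in 3.5 first" step can be done by posting an optimize command to the update handler of the 3.5 instance. A sketch (the host, port, and core path are assumptions for your setup):

```xml
<!-- POST this document to http://localhost:8983/solr/update on the 3.5
     instance (URL assumed). The optimize rewrites every segment, here into
     the 3.x format, which the 4.2 Lucene3x codec can still read. -->
<optimize waitSearcher="true"/>
```

The `IndexFormatTooOldException` above is raised precisely because some segments (e.g. `_yj.*`) are still in the pre-3.0 format; forcing a full segment rewrite in 3.5 is what makes them readable by 4.2.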
Re: solr spell correction help
blandars its not giving correction as blender

They have an edit distance of 3. The Direct spell checker is limited to a maximum edit distance of 2.

-- Jack Krupansky

-Original Message- From: Rohan Thakur Sent: Friday, April 12, 2013 8:45 AM To: solr-user@lucene.apache.org Subject: solr spell correction help

hi all, I have configured Solr direct spell correction on the spell field. For most words Solr corrects and gives suggestions, but for some words, like those below, it gives absurd results. Indexed words: 1) blender 2) kettle 3) electric. Problems: 1) when I search for blandar it correctly suggests blender, but when I search for blandars it gives no correction to blender. 2) when I search for kettle, the correct spelling, it still flags the word as misspelled yet gives no suggestions, even though the result documents show up; and when I search for cettle it correctly suggests kettle, but when I search for cattle it gives no suggestions at all. 3) again, when I search for electric, the correct spelling, the suggestions section flags it as misspelled but gives no suggestions, and documents are still returned since the spelling is correct. Also, what configuration would make Solr return samsung as a spell suggestion when I search for sam? What could be the solution for the above problems? please help. thanks in advance, regards Rohan
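For reference, the edit-distance cap Jack mentions lives in the spellchecker configuration. A hedged sketch of a DirectSolrSpellChecker definition (the field name spell comes from the message above; the other values are illustrative, and maxEdits only accepts 1 or 2 — which is why blandars → blender, at edit distance 3, can never be suggested):

```xml
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <lst name="spellchecker">
    <str name="name">direct</str>
    <str name="classname">solr.DirectSolrSpellChecker</str>
    <str name="field">spell</str>
    <!-- maximum edit distance for candidate corrections; only 1 or 2 is supported -->
    <int name="maxEdits">2</int>
    <!-- minimum length of the query word before suggestions are attempted -->
    <int name="minQueryLength">3</int>
  </lst>
</searchComponent>
```

Note that the sam → samsung case is prefix completion rather than spelling correction; that is usually handled by Solr's separate Suggester component, not by DirectSolrSpellChecker.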
updateLog in Solr 4.2
If I disable the update log in Solr 4.2 then I get the following exception:

SEVERE: :java.lang.NullPointerException
	at org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:190)
	at org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:156)
	at org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:100)
	at org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:266)
	at org.apache.solr.cloud.ZkController.joinElection(ZkController.java:935)
	at org.apache.solr.cloud.ZkController.register(ZkController.java:761)
	at org.apache.solr.cloud.ZkController.register(ZkController.java:727)
	at org.apache.solr.core.CoreContainer.registerInZk(CoreContainer.java:908)
	at org.apache.solr.core.CoreContainer.registerCore(CoreContainer.java:892)
	at org.apache.solr.core.CoreContainer.register(CoreContainer.java:841)
	at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:638)
	at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:629)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:619)
Apr 12, 2013 6:39:56 PM org.apache.solr.common.SolrException log
SEVERE: null:org.apache.solr.common.cloud.ZooKeeperException:
	at org.apache.solr.core.CoreContainer.registerInZk(CoreContainer.java:931)
	at org.apache.solr.core.CoreContainer.registerCore(CoreContainer.java:892)
	at org.apache.solr.core.CoreContainer.register(CoreContainer.java:841)
	at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:638)
	at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:629)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:619)
Caused by: java.lang.NullPointerException
	at org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:190)
	at org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:156)
	at org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:100)
	at org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:266)
	at org.apache.solr.cloud.ZkController.joinElection(ZkController.java:935)
	at org.apache.solr.cloud.ZkController.register(ZkController.java:761)
	at org.apache.solr.cloud.ZkController.register(ZkController.java:727)
	at org.apache.solr.core.CoreContainer.registerInZk(CoreContainer.java:908)
	... 12 more

and Solr fails to start. However, if I add updateLog to my solrconfig.xml it starts. Is the updateLog parameter mandatory for Solr 4.2? -- View this message in context: http://lucene.472066.n3.nabble.com/updateLog-in-Solr-4-2-tp4055548.html Sent from the Solr - User mailing list archive at Nabble.com.
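For anyone hitting the same NPE: SolrCloud's leader election and peer sync replay from the transaction log, so in cloud mode the update log is effectively mandatory. A sketch of the standard solrconfig.xml fragment that restores it:

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- required for SolrCloud leader election and peer sync -->
  <updateLog>
    <str name="dir">${solr.ulog.dir:}</str>
  </updateLog>
</updateHandler>
```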
Re: how to migrate solr 1.4 index to solr 4.2 index
hi, thanks, it works for us. Thanks &amp; Regards, Montu v Boda -- View this message in context: http://lucene.472066.n3.nabble.com/how-to-migrate-solr-1-4-index-to-solr-4-2-index-tp4055531p409.html Sent from the Solr - User mailing list archive at Nabble.com.
Downloaded Solr 4.2.1 Source: Build Failing
common.compile-core:
    [javac] Compiling 337 source files to /Users/umeshprasad/Downloads/solr-4.2.1/solr/build/solr-core/classes/java
    [javac] /Users/umeshprasad/Downloads/solr-4.2.1/solr/core/src/java/org/apache/solr/handler/component/QueryComponent.java:765: cannot find symbol
    [javac] symbol  : class ShardFieldSortedHitQueue
    [javac] location: class org.apache.solr.handler.component.QueryComponent
    [javac]     ShardFieldSortedHitQueue queue;
    [javac]     ^
    [javac] /Users/umeshprasad/Downloads/solr-4.2.1/solr/core/src/java/org/apache/solr/handler/component/QueryComponent.java:766: cannot find symbol
    [javac] symbol  : class ShardFieldSortedHitQueue
    [javac] location: class org.apache.solr.handler.component.QueryComponent
    [javac]     queue = new ShardFieldSortedHitQueue(sortFields, ss.getOffset() + ss.getCount());
    [javac]             ^
    [javac] Note: Some input files use or override a deprecated API.
    [javac] Note: Recompile with -Xlint:deprecation for details.
    [javac] Note: Some input files use unchecked or unsafe operations.
    [javac] Note: Recompile with -Xlint:unchecked for details.
    [javac] 2 errors

-- Thanks &amp; Regards, Umesh Prasad
Need hook to know when replication backup is actually completed.
Hi, I'd like to use the backup command to create a backup of each shard leader's index periodically. This is for disaster recovery in case our data center goes offline. We use SolrCloud leader/replica for day-to-day fault-tolerance and it works great. The backup command (http://master_host:port/solr/replication?command=backup) works just fine but it returns immediately while the actual backup creation runs in the background on the shard leader. Is there any way to know when the actual backup is complete? I need that hook to then move the backup to another storage device outside of our data center, e.g. S3. What are others doing for this type of backup process? Thanks in advance. Tim
Which tokenizer or analizer should use and field type
my schema file is : copyField source=title dest =keyword/ copyField source=body dest =keyword/ copyField source=company_name dest=keyword/ copyField source=company_profile dest=keyword/ field name=title type=text_general indexed=true stored=true/ field name=body type=text_general indexed=true stored=true/ field name=company_name type=text_general indexed=true stored=true/ field name=company_profile type=text_general indexed=true stored=true/ fieldType name=text_general class=solr.TextField positionIncrementGap=100 analyzer type=index tokenizer class=solr.StandardTokenizerFactory/ filter class=solr.StopFilterFactory ignoreCase=true words=stopwords.txt enablePositionIncrements=true / filter class=solr.LowerCaseFilterFactory/ /analyzer analyzer type=query tokenizer class=solr.StandardTokenizerFactory/ filter class=solr.StopFilterFactory ignoreCase=true words=stopwords.txt enablePositionIncrements=true / filter class=solr.SynonymFilterFactory synonyms=synonyms.txt ignoreCase=true expand=true/ filter class=solr.LowerCaseFilterFactory/ /analyzer /fieldType values are like, title: Assistant Coach/ Junior Assistant body: p http://i.imgur.com/buPga.jpg br /br /Oil India Ltd. invites applications for the post of strongSr Medical Officer (Paediatrics) /strongbr / www.freshersworld.combr / strongQualification/strong : MD (Paediatrics) br /br / strongNo of Post/strong : 1URbr / br /strong Pay Scale/strong : Rs 32900 -58000 br / br / strongAge as on 11.04.2013/strong : 32 yrsbr / /ppstrongSelection Procedure : /strongSelection for the above post will be based on Written Test, Group Discussion (GD), Viva-Voce and Medical Examination.br / /p company_profile: pThe story of strongOil India Limited (OIL)/strong traces and symbolises the development and growth of the Indian petroleum industry. 
From the discovery of crude oil in the far east of India at Digboi, Assam in 1889 to its present status as a fully integrated upstream petroleum company, OIL has come far, crossing many milestones./p, company_name: Oil India Limited, please give me suggestion about field type i should use. keyword is copyfield i am using for search. i do not want to search on html content. How search will happen ? if i give words to search project assistant,manager it only should give me keyword have project assistance or manager. right now it is giving me results which has project or assistance or manager that is wrong case for me. Please give me solution for it. I have to complete that task by today thats why i am not able to do research on it. need field type definitions for each field. and how search query i'll write ?? thanks in advance -- View this message in context: http://lucene.472066.n3.nabble.com/Which-tokenizer-or-analizer-should-use-and-field-type-tp4055591.html Sent from the Solr - User mailing list archive at Nabble.com.
Spatial search question
We currently do a radius search from a given Lat/Long point and it works great. I have a new requirement to do a search on a larger radius from the same point, but not include the smaller radius. Kind of a donut (torus) shaped search. How would I do this (Solr 4)? Search where radius is between 20km and 40km for example? Thanks, Ken -- View this message in context: http://lucene.472066.n3.nabble.com/Spatial-search-question-tp4055597.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: tokenizer of solr
Jack, Thanks so much for this info. It's awesome. Ming

On Thu, Apr 11, 2013 at 7:32 PM, Jack Krupansky j...@basetechnology.com wrote: In that case, use the types="wdfftypes.txt" attribute of WDF and map @ and _ to ALPHA as shown in: http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.WordDelimiterFilterFactory -- Jack Krupansky

-Original Message- From: Mingfeng Yang Sent: Thursday, April 11, 2013 8:50 PM To: solr-user@lucene.apache.org Subject: Re: tokenizer of solr

It looks like it's due to the word delimiter filter. Does anyone know whether the protected-words file supports regular expressions or not? Ming

On Thu, Apr 11, 2013 at 4:58 PM, Jack Krupansky j...@basetechnology.com wrote: Try the whitespace tokenizer. -- Jack Krupansky

-Original Message- From: Mingfeng Yang Sent: Thursday, April 11, 2013 7:48 PM To: solr-user@lucene.apache.org Subject: tokenizer of solr

Dear Solr users and developers, I am trying to index some documents, some of which are Twitter messages, and we have a problem when indexing retweets. Say a Twitter user named jpc_108 posts a tweet, and then someone retweets his msg, so that @jpc_108 becomes part of the tweet text body. It seems that before indexing, the tokenizer turns @jpc_108 into jpc and 108, and when we search for jpc_108, it's not there anymore. Is there any way we can keep jpc_108 when it appears as @jpc_108? Thanks, Ming
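For reference, the types file that WordDelimiterFilterFactory's types attribute points at uses one `character => type` mapping per line. A minimal wdfftypes.txt that keeps @ and _ intact, per Jack's suggestion (the file name matches the one he mentions; the mappings are the standard syntax):

```
@ => ALPHA
_ => ALPHA
```

The filter then references it as `<filter class="solr.WordDelimiterFilterFactory" types="wdfftypes.txt" .../>` in the field type's analyzer chain.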
Re: SolrCloud vs Solr master-slave replication
On 4/12/2013 6:45 AM, Victor Ruiz wrote: As you can read, at the end it was due to a fail in the Solr master-slave replication, and now I don't know if we should think about migrating to SolrCloud, since Solr master-slave replications seems not to fit to our requirements: * index size: ~20 million documents, ~9GB * ~1200 updates/min * ~1 queries/min (distributed over 2 slaves) MoreLikeThis, RealTimeGet, TermVectorComponent, SearchHandler I would thank you if anyone could help me to answer these questions: * Would it be advisable to migrate to SolrCloud? Would it have impact on the replication performance? * In that case, what would have better performance? to maintain a copy of the index in every server, or to use shard servers? * How many shards and replicas would you advice for ensuring high availability? The fact that your replication is producing a corrupt index suggests that your network, your server hardware, or your software install is unreliable. The TCP protocol used for all Solr communication (as well as the Internet in general) has error detection and retransmissions. I'm not saying that replication can't have bugs, but usually those bugs result in replication not working, they don't typically cause index corruption. I see a previous message where you say everything is on the same LAN with gigabit ethernet. There are a lot of things that can go wrong with gigabit. At the physical layer: Using cat5 cable instead of cat5e or cat6 can lead to problems. You could have a bad cable, or the RJ45 connectors could be badly crimped. If you are using patch panels, they may be bad or only rated for cat5. At layer 2, you can have duplex mismatches, common when one side is hard-set to full duplex and the other side is left at auto or is a dumb switch that can't be changed. Even if you have these problems, it still won't usually cause data corruption unless the hardware or OS is also faulty. 
One somewhat common example of a problem that can cause data corruption in network communication is buggy firmware on the network card, especially with Broadcom chips. Upgrading to the latest firmware will usually fix these problems. Now for your questions: SolrCloud doesn't use replication during normal operation. When you index, the indexing happens on all replicas in parallel. Replication does sometimes get used by SolrCloud, but only if a replica goes down and there's not enough information in the transaction log to reconstruct recent updates when it comes back up. As for whether or not to use shards: that's really up to you. Solr should have no trouble with a single-shard 9GB index that has 20 million documents, as long as you give enough memory to the java heap and have 8GB or so left over for the OS to cache the index. That means you want to have 12-16GB of RAM in each server. If Solr is not the only thing running on the hardware, then you'd want more RAM. For the update and query volume you have described, having plenty of RAM and lots of CPU cores will be critical. Thanks, Shawn
Re: Downloaded Solr 4.2.1 Source: Build Failing
: /Users/umeshprasad/Downloads/solr-4.2.1/solr/core/src/java/org/apache/solr/handler/component/QueryComponent.java:765: cannot find symbol
: [javac] symbol : class ShardFieldSortedHitQueue
: [javac] location: class org.apache.solr.handler.component.QueryComponent
: [javac] ShardFieldSortedHitQueue queue;

Weird ... can you give us more details about the Java compiler you are using? ShardFieldSortedHitQueue is a package-protected class declared in ShardDoc.java (in the same package as QueryComponent). That isn't exactly a best practice, but it shouldn't be causing a compilation failure. -Hoss
Configure compositekey
Hi, I want to explore the hash-based document routing available in Solr 4.1. Please share the configuration for generating a composite key. Thanks. -- View this message in context: http://lucene.472066.n3.nabble.com/Configure-compositekey-tp4055645.html Sent from the Solr - User mailing list archive at Nabble.com.
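For what it's worth, composite-key routing in Solr 4.1 needs no schema configuration: with the default compositeId router (used when a collection is created with numShards set), the shard hash is derived from a prefix embedded in the document id itself. A sketch, with made-up collection and id values:

```
# Index documents whose ids carry a routing prefix, separated by "!":
#   customerA!doc1
#   customerA!doc2   <- hashes to the same shard as doc1
#   customerB!doc7   <- may land on a different shard
#
# At query time the same prefix can restrict the request to that shard:
#   /solr/collection1/select?q=*:*&shard.keys=customerA!
```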
Re: Need hook to know when replication backup is actually completed.
Update to this ... did some code scanning and it looks like the backup status is available via the details command, e.g.:

<lst name="backup">
  <str name="startTime">Fri Apr 12 17:53:17 UTC 2013</str>
  <int name="fileCount">120</int>
  <str name="status">success</str>
  <str name="snapshotCompletedAt">Fri Apr 12 17:58:22 UTC 2013</str>
</lst>

So with a little polling of the details command from my backup script, I'm good to go. If anyone knows of a more direct way, let me know; otherwise I'm moving ahead with this approach. Cheers, Tim

On Fri, Apr 12, 2013 at 9:31 AM, Timothy Potter thelabd...@gmail.com wrote: Hi, I'd like to use the backup command to create a backup of each shard leader's index periodically. This is for disaster recovery in case our data center goes offline. We use SolrCloud leader/replica for day-to-day fault-tolerance and it works great. The backup command ( http://master_host:port/solr/replication?command=backup) works just fine but it returns immediately while the actual backup creation runs in the background on the shard leader. Is there any way to know when the actual backup is complete? I need that hook to then move the backup to another storage device outside of our data center, e.g. S3. What are others doing for this type of backup process? Thanks in advance. Tim
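A minimal sketch of the polling helper Tim describes, assuming the XML shape quoted above (only the parsing is shown; fetching the response from /solr/replication?command=details and the retry loop are left to the surrounding script):

```python
import xml.etree.ElementTree as ET

def backup_status(details_xml):
    """Extract the backup status from a replication ?command=details response.

    Returns the text of <str name="status"> inside <lst name="backup">,
    or None if no backup section is present yet.
    """
    root = ET.fromstring(details_xml)
    for lst in root.iter("lst"):
        if lst.get("name") == "backup":
            for child in lst:
                if child.get("name") == "status":
                    return child.text
    return None

# Example response fragment modeled on the output quoted above.
sample = """<response>
  <lst name="backup">
    <str name="startTime">Fri Apr 12 17:53:17 UTC 2013</str>
    <int name="fileCount">120</int>
    <str name="status">success</str>
    <str name="snapshotCompletedAt">Fri Apr 12 17:58:22 UTC 2013</str>
  </lst>
</response>"""

print(backup_status(sample))  # -> success
```

A backup script would sleep and re-fetch the details response until this returns "success" (or a failure string), then copy the snapshot directory off-site.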
Re: Solr 4.2.1 SSLInitializationException
: Thanks for your response. As I mentioned in my email, I would prefer
: the application to not have access to the keystore. Do you know if there

I'm confused ... it seems that you (or GlassFish) have created a Catch-22... You say you don't want the application to have access to the keystore, but apparently you (or GlassFish) are explicitly setting javax.net.ssl.keyStore to tell the application which keystore to use. The keystore you specify has a password set on it, but you are not telling the application what the password is, so it can't use that keystore. If you don't want the application to have access to the keystore at all, have you tried unsetting javax.net.ssl.keyStore?

: is a way of specifying a different HttpClient implementation (e.g.
: DefaultHttpClient rather than SystemDefaultHttpClient) ?

In SolrJ client code you can specify whatever HttpClient implementation you want. In Solr itself (for its use in talking to other nodes in distributed search, which is what your stack trace indicates), SystemDefaultHttpClient is hard-coded. -Hoss
CSS appearing in Solr 4.2.1 logs
Hey guys, This sounds crazy, but does anyone see strange CSS/HTML in their Solr 4.2.x logs? Often I am finding entire CSS documents (likely from Solr's Admin UI) in my Jetty stderrout log. Example: 2013-04-12 00:23:20.363:WARN:oejh.HttpGenerator:Ignoring extra content /** * @license RequireJS order 1.0.5 Copyright (c) 2010-2011, The Dojo Foundation All Rights Reserved. * Available via the MIT or new BSD license. * see: http://github.com/jrburke/requirejs for details */ /*jslint nomen: false, plusplus: false, strict: false */ /*global require: false, define: false, window: false, document: false, setTimeout: false */ //Specify that requirejs optimizer should wrap this code in a closure that //maps the namespaced requirejs API to non-namespaced local variables. /*requirejs namespace: true */ (function () { //Sadly necessary browser inference due to differences in the way //that browsers load and execute dynamically inserted javascript //and whether the script/cache method works when ordered execution is //desired. Currently, Gecko and Opera do not load/fire onload for scripts with //type="script/cache" but they execute injected scripts in order //unless the 'async' flag is present. //However, this is all changing in latest browsers implementing HTML5 //spec. With compliant browsers .async true by default, and //if false, then it will execute in order. Favor that test first for forward //compatibility.
var testScript = typeof document !== "undefined" && typeof window !== "undefined" && document.createElement("script"), supportsInOrderExecution = testScript && (testScript.async || ((window.opera && Object.prototype.toString.call(window.opera) === "[object Opera]") || //If Firefox 2 does not have to be supported, then //a better check may be: //('mozIsLocallyAvailable' in window.navigator) ("MozAppearance" in document.documentElement.style))), Due to this, my logs are getting really huge, and sometimes it breaks my tail -F commands on the logs, printing what looks like binary, so there is possibly some other junk in my logs aside from the CSS. I am running Jetty 8.1.10 and Solr 4.2.1 (stable build). Cheers! Tim Vaillancourt
dataimporter.last_index_time SolrCloud
My data-config files use the dataimporter.last_index_time variable, but it seems to have stopped working when I upgraded to 4.2. In previous 4.x versions, I saw that it was being written to zookeeper, but now there's nothing there. Did anything change? Or should I be doing something differently? Thanks! Jim -- View this message in context: http://lucene.472066.n3.nabble.com/dataimporter-last-index-time-SolrCloud-tp4055679.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: dataimporter.last_index_time SolrCloud
Same issue here. Also, the properties file now has multiple last_index_time entries, one per entity, and we cannot reference an individual one anymore: DIH.entity1.last_index_time no longer passes through to the query. On Friday, April 12, 2013, jimtronic wrote: My data-config files use the dataimporter.last_index_time variable, but it seems to have stopped working when I upgraded to 4.2. In previous 4.x versions, I saw that it was being written to ZooKeeper, but now there's nothing there. Did anything change? Or should I be doing something differently? Thanks! Jim -- View this message in context: http://lucene.472066.n3.nabble.com/dataimporter-last-index-time-SolrCloud-tp4055679.html Sent from the Solr - User mailing list archive at Nabble.com. -- Bill Bell billnb...@gmail.com cell 720-256-8076
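For context, this is the usual way the variable is consumed in a DIH data-config.xml — a hedged sketch with a made-up table and columns; in multi-entity configs the per-entity form ${dataimporter.entity1.last_index_time} is what the posters report no longer resolving:

```xml
<entity name="entity1"
        query="SELECT id, name FROM item"
        deltaQuery="SELECT id FROM item
                    WHERE last_modified &gt; '${dataimporter.last_index_time}'"
        deltaImportQuery="SELECT id, name FROM item
                          WHERE id = '${dataimporter.delta.id}'"/>
```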
Re: Need hook to know when replication backup is actually completed.
Tim, thank you for this! I had been looking for this a while back (even posted something on serverfault) and never got a decent answer. This is exactly what I was looking for. -- Nate Fox Sr Systems Engineer o: 310.658.5775 m: 714.248.5350 Follow us @NEOGOV http://twitter.com/NEOGOV and on Facebookhttp://www.facebook.com/neogov NEOGOV http://www.neogov.com/ is among the top fastest growing software companies in the USA, recognized by Inc 500|5000, Deloitte Fast 500, and the LA Business Journal. We are hiring!http://www.neogov.com/#/company/careers On Fri, Apr 12, 2013 at 12:04 PM, Timothy Potter thelabd...@gmail.comwrote: Update to this ... did some code scanning and it looks like the backup status is available via the details command, e.g. lst name=backup str name=startTimeFri Apr 12 17:53:17 UTC 2013/str int name=fileCount120/int str name=statussuccess/str str name=snapshotCompletedAtFri Apr 12 17:58:22 UTC 2013/str /lst So with a little polling of the details command from my backup script and I'm good to go. If anyone knows of a more direct way, let me know otherwise I'm moving ahead with this approach. Cheers, Tim On Fri, Apr 12, 2013 at 9:31 AM, Timothy Potter thelabd...@gmail.com wrote: Hi, I'd like to use the backup command to create a backup of each shard leader's index periodically. This is for disaster recovery in case our data center goes offline. We use SolrCloud leader/replica for day-to-day fault-tolerance and it works great. The backup command ( http://master_host:port/solr/replication?command=backup) works just fine but it returns immediately while the actual backup creation runs in the background on the shard leader. Is there any way to know when the actual backup is complete? I need that hook to then move the backup to another storage device outside of our data center, e.g. S3. What are others doing for this type of backup process? Thanks in advance. Tim
Re: how to migrate solr 1.4 index to solr 4.2 index
Hi, have you re-indexed from scratch, or moved the 1.4 index to 4.2? Thanks. -- View this message in context: http://lucene.472066.n3.nabble.com/how-to-migrate-solr-1-4-index-to-solr-4-2-index-tp4055531p4055686.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Which tokenizer or analizer should use and field type
Unfortunately, Solr doesn't have a query parser that would give the meaning you want for: "project assistant","manager". For now, you would need to write that query as:

(project AND assistant) OR manager

Or maybe as:

"project assistant"~5 OR manager

That would require project and assistant to occur within a few words of each other. Or, if you have q.op defaulted to OR:

"project assistant"~5 manager

Add the HTML strip char filter to your text field type:

<charFilter class="solr.HTMLStripCharFilterFactory"/>

text_general is a semi-decent place to start.

-- Jack Krupansky

-Original Message- From: anurag.jain Sent: Friday, April 12, 2013 11:32 AM To: solr-user@lucene.apache.org Subject: Which tokenizer or analizer should use and field type

[original message quoted in full; see the previous post in this thread]

-- View this message in context: http://lucene.472066.n3.nabble.com/Which-tokenizer-or-analizer-should-use-and-field-type-tp4055591.html Sent from the Solr - User mailing list archive at Nabble.com.
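A sketch of what the poster's text_general might look like with Jack's suggestion applied — the charFilter added to the index-time analyzer so HTML tags in body and company_profile never reach the tokenizer (the type name text_html is a placeholder; everything else mirrors the schema quoted in this thread):

```xml
<fieldType name="text_html" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <!-- strips <p>, <br/>, <strong>, etc. before tokenization -->
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"
            enablePositionIncrements="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"
            enablePositionIncrements="true"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Query-time stripping isn't needed, since users type plain text rather than HTML.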
Re: CSS appearing in Solr 4.2.1 logs
: This sounds crazy, but does anyone see strange CSS/HTML in their Solr 4.2.x : logs? are you sure you're running 4.2.1 and not 4.2? https://issues.apache.org/jira/browse/SOLR-4573 -Hoss
RE: CSS appearing in Solr 4.2.1 logs
Thanks Chris! Somehow I managed to miss that ticket searching, thanks for looking for me. I will confirm the version I have and I am glad to hear this was reported and resolved! Cheers, Tim -Original Message- From: Chris Hostetter [mailto:hossman_luc...@fucit.org] Sent: Friday, April 12, 2013 2:53 PM To: solr-user@lucene.apache.org Subject: Re: CSS appearing in Solr 4.2.1 logs : This sounds crazy, but does anyone see strange CSS/HTML in their Solr 4.2.x : logs? are you sure you're running 4.2.1 and not 4.2? https://issues.apache.org/jira/browse/SOLR-4573 -Hoss
Re: Spatial search question
Outer distance AND NOT inner distance? On 04/12/2013 09:02 AM, kfdroid wrote: We currently do a radius search from a given Lat/Long point and it works great. I have a new requirement to do a search on a larger radius from the same point, but not include the smaller radius. Kind of a donut (torus) shaped search. How would I do this (Solr 4)? Search where radius is between 20km and 40km for example? Thanks, Ken -- View this message in context: http://lucene.472066.n3.nabble.com/Spatial-search-question-tp4055597.html Sent from the Solr - User mailing list archive at Nabble.com.
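A concrete way to express Lance's suggestion as Solr request parameters — a sketch assuming a single-valued LatLonType field (so geodist() applies); the field name store and the point are hypothetical:

```
&sfield=store&pt=26.012156,-80.311943
&fq={!frange l=20 u=40}geodist()
```

An equivalent formulation is two filters, fq={!geofilt d=40} plus a negated fq=-_query_:"{!geofilt d=20}". As David notes later in the thread, neither is exact for multi-valued fields.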
Re: Easier way to do this?
Bill, I responded to the issue you created about this: https://issues.apache.org/jira/browse/SOLR-4704 In summary, use {!geofilt}. ~ David

Billnbell wrote: I would love for Solr spatial 4 to support pt so that I can get the # of results around a central point easily like in 3.6. How can I pass parameters to a Circle()? I would love to send pt to this query, since the pt is the same across multiple areas. For example:

http://localhost:8983/solr/core/select?rows=0&q=*:*&facet=true
  &facet.query={!key=.5}store_geohash:"Intersects(Circle(26.012156,-80.311943 d=.0072369))"
  &facet.query={!key=1}store_geohash:"Intersects(Circle(26.012156,-80.311943 d=.01447))"
  &facet.query={!key=5}store_geohash:"Intersects(Circle(26.012156,-80.311943 d=.0723))"
  &facet.query={!key=10}store_geohash:"Intersects(Circle(26.012156,-80.311943 d=.1447))"
  &facet.query={!key=25}store_geohash:"Intersects(Circle(26.012156,-80.311943 d=.361846))"
  &facet.query={!key=50}store_geohash:"Intersects(Circle(26.012156,-80.311943 d=.72369))"
  &facet.query={!key=100}store_geohash:"Intersects(Circle(26.012156,-80.311943 d=1.447))"

- Author: http://www.packtpub.com/apache-solr-3-enterprise-search-server/book -- View this message in context: http://lucene.472066.n3.nabble.com/Easier-way-to-do-this-tp4055474p4055732.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Support old syntax including geodist
Hi Bill, FYI see https://issues.apache.org/jira/browse/SOLR-4242

Billnbell wrote: Since spatial Lucene 4 does not seem to support geodist(), even sending d, pt, and fq={!geofilt} does not help me; I need to sort. So I end up having to set up the sortsq. Any other ideas on how to support the old syntax on the new spatial? Can I create a transform or something?

A couple of times or more I've looked into how geodist() works, with the intention of adding support for the new spatial 4 field type, but I wind up concluding the result would be a big hack, because geodist() works fundamentally differently than how it would need to work, yet it would somehow have to work in two different ways. Maybe I should just accept that it's going to be an ugly hack, trading that for making things easier for users. Another thing I want to mention: if you've got a single-valued spatial field, then I suggest using LatLonType, if for nothing else but sorting, and hence you can use geodist(). Convert:

http://localhost:8983/solr/providersearch/select?rows=20&q=*:*&fq={!geofilt}&pt=26.012156,-80.311943&d=50&sfield=store_geohash&sort=geodist() asc

To:

http://localhost:8983/solr/providersearch/select?rows=20&q=*:*&fq={! v=$geoq}&sortsq={! score=distance v=$geoq}&geoq=store_geohash:"Intersects(Circle(26.012156,-80.311943 d=.72369))"&fl=store_lat_lon,distance:mul(query($sortsq),69.09)&sort=query($sortsq) asc

I'm aware things can get ugly, but can't you just use 'q' for the spatial query that returns the distance as the score, both for sorting and for returning it? It'd significantly simplify this query. ~ David

- Author: http://www.packtpub.com/apache-solr-3-enterprise-search-server/book -- View this message in context: http://lucene.472066.n3.nabble.com/Support-old-syntax-including-geodist-tp4055476p4055733.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Spatial search question
Yup, Lance is right. But it won't always work if you have multi-valued data since it wouldn't match a document that had a point both in the ring and the hole. Another approach that internally works faster and addresses the multi-value case is to implement a custom Spatial4j Shape. In this case, you could create a special aggregate Shape that basically accepts one shape and excludes the other, in its custom relate() method. It's like a subtracting shape. This is generically useful and on my list of things to do but I haven't had the need. The other step would be parsing it somehow, so you might do that by extending the existing spatial 4 field type. ~ David Lance Norskog-2 wrote Outer distance AND NOT inner distance? On 04/12/2013 09:02 AM, kfdroid wrote: We currently do a radius search from a given Lat/Long point and it works great. I have a new requirement to do a search on a larger radius from the same point, but not include the smaller radius. Kind of a donut (torus) shaped search. How would I do this (Solr 4)? Search where radius is between 20km and 40km for example? Thanks, Ken -- View this message in context: http://lucene.472066.n3.nabble.com/Spatial-search-question-tp4055597.html Sent from the Solr - User mailing list archive at Nabble.com. - Author: http://www.packtpub.com/apache-solr-3-enterprise-search-server/book -- View this message in context: http://lucene.472066.n3.nabble.com/Spatial-search-question-tp4055597p4055735.html Sent from the Solr - User mailing list archive at Nabble.com.