Re: difference these two queries

2012-12-11 Thread Mikhail Khludnev
It's worth mentioning that fq pays off only if you have a high hit ratio and a properly sized filterCache. If you see a low hit ratio, you just waste resources on it. On 11.12.2012 at 8:24, user Otis Gospodnetic otis.gospodne...@gmail.com wrote: If you don't need scoring on it then yes, just use

Boost docs which are posted recently

2012-12-11 Thread Sangeetha
Hi, I have a doc with the field type_s. The value can be news, photos and videos. The priority should follow this order: photos, videos, then news, using the query below: q=sachin&defType=dismax&bq=type_s:photos^10&bq=type_s:videos^7&bq=type_s:news^5 Even though it is giving more priority to
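The boost query in this post can also be assembled programmatically, which avoids separator and escaping mistakes. The following is a minimal sketch using only Python's standard library; the base URL (host, port, path) and the helper name build_dismax_url are assumptions for illustration, while the parameter names and boost values come from the post.

```python
from urllib.parse import urlencode, urlparse, parse_qs

def build_dismax_url(base_url, term, boosts):
    """Build a dismax query URL with one bq parameter per (value, weight) pair."""
    params = [("q", term), ("defType", "dismax")]
    for value, weight in boosts:
        # Each boost clause becomes its own repeated bq parameter.
        params.append(("bq", f"type_s:{value}^{weight}"))
    return base_url + "?" + urlencode(params)

# Hypothetical endpoint; replace with your own Solr select handler.
url = build_dismax_url(
    "http://localhost:8983/solr/select",
    "sachin",
    [("photos", 10), ("videos", 7), ("news", 5)],
)

# Decode the query string again to confirm the three boost clauses survived.
decoded = parse_qs(urlparse(url).query)
print(decoded["bq"])
```

Passing the parameters as a list of pairs rather than a dict keeps the repeated bq keys, which a dict would collapse into one.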

Re: difference these two queries

2012-12-11 Thread Alexandre Rafalovitch
Would you still have a performance impact if you set it to no cache and just use fq to avoid impact on rating? On Dec 11, 2012 7:00 PM, Mikhail Khludnev mkhlud...@griddynamics.com wrote: It's worth mentioning that fq pays off only if you have a high hit ratio and a properly sized filterCache. If you
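For readers wondering what "no cache" looks like in a request: Solr accepts a cache=false local parameter on a filter query, which keeps the filter out of the filterCache. The sketch below only builds such a request URL; the base URL, the query term, and the filter clause are hypothetical, and whether this helps performance in a given setup is exactly the open question in this thread.

```python
from urllib.parse import urlencode, urlparse, parse_qs

def uncached_filter(clause):
    """Prefix a filter clause with the cache=false local parameter."""
    return "{!cache=false}" + clause

def build_filtered_url(base_url, query, filter_clause):
    """Build a select URL whose fq restricts results without affecting scores."""
    params = [("q", query), ("fq", uncached_filter(filter_clause))]
    return base_url + "?" + urlencode(params)

# Hypothetical endpoint and field values, for illustration only.
filtered_url = build_filtered_url(
    "http://localhost:8983/solr/select", "sachin", "type_s:photos"
)
decoded_filter = parse_qs(urlparse(filtered_url).query)["fq"][0]
print(decoded_filter)
```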

Re: difference these two queries

2012-12-11 Thread Mikhail Khludnev
My bet, and it is only a bet, is that you'll pointlessly create garbage on your heap for bitsets, which might cause GC pressure; but if the filter is quite selective you'll have a rather compact SortedIntDocSet, which would not be so noticeable. To avoid impact on rating I can suggest you boost that clause quite

Re: Wildcards and fuzzy/phonetic query

2012-12-11 Thread Ahmet Arslan
Lowercasing actually seems to work with Wildcard queries, but not with fuzzy queries.  Are there any reasons why I should experience such a difference? Hi Haagen, Yonik added this recently. https://issues.apache.org/jira/browse/SOLR-4076

Re: How to parse XML attributes with prefix using DIH?

2012-12-11 Thread Alexandre Rafalovitch
I believe DIH completely ignores namespaces/prefixes. Try skipping those and just use local names. On 10 Dec 2012 20:48, zhk011 zhk...@hotmail.com wrote: Hi there, I'm new to Solr and DIH, recently I've been planning to use Solr/DIH to index some local xml files. Following the DIH example

Re: Wildcards and fuzzy/phonetic query

2012-12-11 Thread Haagen Hasle
Thank you! I actually tried to look through Jira, but I didn't focus on the minor issues. For me, this is quite critical. :-) Any chance of merging this into the 4.0.1 release? Regards, Haagen On 11 Dec 2012 at 12:45, Ahmet Arslan wrote: Lowercasing actually seems to work with

Index sharing between multiple slaves

2012-12-11 Thread suri
Hi, We are planning to set up multiple slaves to handle search loads. Some of these slaves will be on the same physical machine. Instead of each slave doing the replication, 1. Can we share the index with multiple slaves? All slaves are read-only. 2. Can we have master and slave share the index?

Solr and nostage deployment in weblogic

2012-12-11 Thread suri
Hi, We are going to run Solr on Oracle Weblogic server. We would like to utilize Oracle Weblogic's no-stage deployment. This means, we will have webapp (Solr.war) deployed on a shared disk with multiple weblogic nodes (JVM's or Managed servers in weblogic terms) booting the same web app. Solr

modeling prices based on daterange using multipoints

2012-12-11 Thread britske
HI all, Based on some good discussion in Modeling openinghours using multipoints http://lucene.472066.n3.nabble.com/Modeling-openinghours-using-multipoints-tp4025336p4025683.html I was triggered to have a review of an old painpoint of mine: modeling pricing availability of hotels which

Partial results returned

2012-12-11 Thread adm1n
Hello, I'm running SolrCloud with 2 shards. Let's assume I have 100 documents indexed in total, which are divided 55/45 between the shards... when I query, for example: curl 'http://localhost:7500/solr/index/select?q=*:*&wt=json&indent=true&rows=0' sometimes I get the response: {"numFound":0, sometimes -
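One way to narrow down inconsistent counts like these is to compare the distributed count with the count each node reports on its own. The sketch below builds the count request and reads numFound from a JSON response; distrib=false is a Solr request parameter that limits the query to the receiving core, while the helper names and the sample response text are assumptions for illustration (no request is actually sent here).

```python
import json
from urllib.parse import urlencode, urlparse, parse_qs

def count_url(base_url, distrib=True):
    """Build a match-all query that returns only the hit count as JSON."""
    params = [("q", "*:*"), ("wt", "json"), ("rows", "0")]
    if not distrib:
        # Ask only the core that receives the request, not the whole collection.
        params.append(("distrib", "false"))
    return base_url + "?" + urlencode(params)

def num_found(response_text):
    """Extract numFound from a Solr JSON response body."""
    return json.loads(response_text)["response"]["numFound"]

local_url = count_url("http://localhost:7500/solr/index/select", distrib=False)

# Hypothetical response body, shaped like Solr's JSON output.
sample = '{"responseHeader": {"status": 0}, "response": {"numFound": 55, "start": 0, "docs": []}}'
print(local_url)
print(num_found(sample))
```

If the per-node counts add up to the expected total but the distributed count does not, the problem is more likely in how the nodes see each other than in the indexed data.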

RE: Boost docs which are posted recently

2012-12-11 Thread Swati Swoboda
Hi Sangeetha, If you need to boost based on date regardless of type, just use date boosting with a higher boost: http://wiki.apache.org/solr/SolrRelevancyFAQ#How_can_I_boost_the_score_of_newer_documents http://wiki.apache.org/solr/FunctionQuery#Date_Boosting -Original Message- From:
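The date boost described on the linked wiki pages is usually written as recip(ms(NOW,mydatefield),3.16e-11,1,1), where recip(x,m,a,b) computes a/(m*x+b) and 3.16e-11 is roughly one over the number of milliseconds in a year. The sketch below evaluates that function in plain Python to show how the boost decays with document age; the field name mydatefield and the helper names are placeholders, and the constants should be tuned for your own data.

```python
MS_PER_YEAR = 365 * 24 * 3600 * 1000  # milliseconds in a 365-day year

def recip(x, m, a, b):
    """Reciprocal function as defined for Solr function queries: a / (m*x + b)."""
    return a / (m * x + b)

def recency_boost(age_ms):
    """Boost for a document that is age_ms milliseconds old."""
    return recip(age_ms, 3.16e-11, 1, 1)

# Print the boost for documents of a few different ages, in years.
for years in (0, 1, 5):
    print(years, round(recency_boost(years * MS_PER_YEAR), 3))
```

With these constants a brand-new document gets a boost of 1 and a one-year-old document gets roughly half of that, so the function rewards recency without ever dropping to zero.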

RE: highlighting multiple occurrences

2012-12-11 Thread Rafael Ribeiro
This I didn't know... I have a file named buscar.vm with the important part as follows: <div class="results"> #foreach($doc in $response.results) #parse("hit.vm") #end </div> hit.vm as follows: #set($docId = $doc.getFieldValue('id')) <div class="result-document"> #parse("doc.vm") </div>

RE: highlighting multiple occurrences

2012-12-11 Thread Rafael Ribeiro
I forgot to mention that the field for which I wish to have multiple occurrences shown is the field named conteudo. I am already trying to make it iterate, but so far with no success... -- View this message in context:

RE: Searching for phrase

2012-12-11 Thread Swati Swoboda
It's because you are escaping. Look at this bit: [parsedquery_toString] = +(smsc_content:abcdefg12345 smsc_content:678910 smsc_description:abcdefg12345 smsc_content:678910) +smsc_lastdate:[1352627991000 TO 1386755331000] It's searching for as well because you escaped it (hence it is not

RE: highlighting multiple occurrences

2012-12-11 Thread Rafael Ribeiro
I saw this... since I didn't know that much velocity I'll try to understand but I will be really glad if (obviously in case it didn't take you much time) you point me in the direction of the changes I need to do in my files... best regards, Rafael -- View this message in context:

Re: Partial results returned

2012-12-11 Thread Per Steffensen
When you say 2 shards do you mean 2 nodes each running a shard? Seems like you have a collection named index - how did you create this collection (containing two shards)? How do you start your 2 nodes - exact command used? You might want to attach the content of clusterstate.json from ZK. While

RE: highlighting multiple occurrences

2012-12-11 Thread Rafael Ribeiro
Did it as suggested in the link I sent. Thanks a lot! -- View this message in context: http://lucene.472066.n3.nabble.com/highlighting-multiple-occurrences-tp4025715p4026063.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Retrieving one object

2012-12-11 Thread Drone42
My mistake; I was using the embedded Solr server. I was storing through one route and searching through another. The changes from the storage weren't visible (file open) to the search. Changing to use the standalone server solved the problem. -- View this message in context:

Re: Index sharing between multiple slaves

2012-12-11 Thread Upayavira
What is the benefit of having two slaves on the same machine? While it is possible to share indexes across nodes, it isn't recommended, as you'll need to know what you are doing, and will be deviating from the norm which will make your life (much) harder. I really don't see how there's a benefit

Re: Partial results returned

2012-12-11 Thread adm1n
I have 1 collection called index. I created it like explained here: http://wiki.apache.org/solr/SolrCloud in Example A: Simple two shard cluster section here are the start up commands: 1)java -Dbootstrap_confdir=./solr/index/conf -Dcollection.configName=myconf -DzkRun -DnumShards=2 -jar start.jar

Re: how to remove duplicate data while facet?

2012-12-11 Thread 蒋明原
Thank you, first of all. Yes, having no duplicate unique keys would avoid this trouble. But right now I can't reindex my data; it's too big, and it's in a production environment. So, does anyone have a solution? Thank you. On Wednesday, December 12, 2012, Pawel wrote: I think that solution is quite obvious. Be

suggestion howto handle highly repetitive valued field

2012-12-11 Thread Jie Sun
Hi - our indexed documents currently store Solr fields like 'digest' or 'type', for which most of our documents end up with the same value (such as 'sha1' for field 'digest', or 'message' for field 'type', etc.). On each Solr server, we usually have hundreds of millions of documents indexed and with the

Re: SolrCloud - ClusterState says we are the leader,but locally ...

2012-12-11 Thread Sudhakar Maddineni
Just an update on this issue: We tried by increasing zookeeper client timeout settings to 3ms in solr.xml (i think default is 15000ms), and haven't seen any issues from our tests. cores . zkClientTimeout=3 Thanks, Sudhakar. On Fri, Dec 7, 2012 at 4:55 PM, Sudhakar

A SPI class of type org.apache.lucene.codecs.Codec with name 'Lucene41' does not exist.

2012-12-11 Thread shreejay
I am getting the following error : Caused by: org.apache.solr.common.SolrException: Error opening new searcher at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1326) at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1438) at

Too many Tika errors

2012-12-11 Thread eShard
I'm running Solr 4.0 on Tomcat 7.0.8, and I'm running the solr/example single core as well, with ManifoldCF v1.1. I had everything working, but then the crawler stops and I have Tika errors in the Solr log. I had Tika 1.1 and that produces these errors: org.apache.solr.common.SolrException:

Re: suggestion howto handle highly repetitive valued field

2012-12-11 Thread David Smiley (@MITRE.org)
The indexed=true side is quite efficient. The stored=true side -- not so much, but the strings you have here are pretty small and I wouldn't worry about it. Solr 4.1 (unreleased) does a great job here and compresses all the stored field data across documents. ~ David Jie Sun wrote Hi - our

RE: Which fields matched?

2012-12-11 Thread Jeff Wartes
Thanks, this is good stuff, I hadn't seen LUCENE-1999, and it even has a reference to some methods now in core. I think I'm still stuck for the moment though. I'm pretty fixed on Solr 3.5 for the next few development cycles, and I've been trying really hard to avoid compiling my own Solr - as

Re: SolrCloud - Query performance degrades with multiple servers

2012-12-11 Thread sausarkar
Do you know when 4.1 will be released, or whether there will be a 4.0.1 release with bug fixes from 4.0? Thanks -- View this message in context: http://lucene.472066.n3.nabble.com/SolrCloud-Query-performance-degrades-with-multiple-servers-tp4024660p4026139.html

spatial searches and geo-json data

2012-12-11 Thread solr-user
Hi all. I have a large amount of spatial data in GeoJSON format that I get from MS SQL Server. I want to be able to index that data and am trying to figure out how to convert the data into WKT format, since Solr only accepts WKT. Is anyone aware of any Solr module or T-SQL code or C# code that
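For simple geometries the conversion asked about here is mostly string formatting, since GeoJSON and WKT both list coordinates in x y (longitude latitude) order. The following is a minimal sketch in Python covering only Point, LineString, and Polygon; it is not the module the poster asked for, the function names are made up for illustration, and multi-geometries or geometry collections would need extra cases.

```python
import json

def _positions(points):
    """Format a list of GeoJSON positions as 'x y, x y, ...' (extra dimensions dropped)."""
    return ", ".join(f"{p[0]} {p[1]}" for p in points)

def geojson_to_wkt(geometry):
    """Convert a GeoJSON geometry dict (Point, LineString, Polygon) to a WKT string."""
    kind = geometry["type"]
    coords = geometry["coordinates"]
    if kind == "Point":
        return f"POINT ({coords[0]} {coords[1]})"
    if kind == "LineString":
        return f"LINESTRING ({_positions(coords)})"
    if kind == "Polygon":
        # A polygon is a list of rings: the outer boundary first, then any holes.
        rings = ", ".join(f"({_positions(ring)})" for ring in coords)
        return f"POLYGON ({rings})"
    raise ValueError(f"unsupported geometry type: {kind}")

point = json.loads('{"type": "Point", "coordinates": [10, 20]}')
triangle = json.loads(
    '{"type": "Polygon", "coordinates": [[[0, 0], [1, 0], [1, 1], [0, 0]]]}'
)
print(geojson_to_wkt(point))
print(geojson_to_wkt(triangle))
```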

Highlighting data stored outside of Solr

2012-12-11 Thread Michael Ryan
Has anyone ever attempted to highlight a field that is not stored in Solr? We have been considering not storing fields in Solr, but still would like to use Solr's built-in highlighting. At first glance, it looks like it would be fairly simple to modify DefaultSolrHighlighter to get the stored

Re: modeling prices based on daterange using multipoints

2012-12-11 Thread David Smiley (@MITRE.org)
Hi Britske, This is a very interesting question! britske wrote ... I realize the new spatial-stuff in Solr 4 is no magic bullet, but I'm wondering if I could model multiple prices per day as multipoints, whereas: - date*duration*nr of persons*roomtype is modeled as point.x

SolrJ/Solr version mismatch error

2012-12-11 Thread Sean Timm
I ran into this today, and it took me longer than it should have to figure out the problem, so I wanted to write and share my experience to save someone else some time. A web search and a search through the mail archives didn't provide any elucidation. If you run SolrJ 4.0.0 BETA connecting to

dataimport.properties not created/updated with solrcloud

2012-12-11 Thread adm1n
Hi, I have a problem with updating dataimport.properties - while running a single Solr there is no problem at all. Everything works perfectly. But when I switch to a cloud configuration with 2 shards (as described in http://wiki.apache.org/solr/SolrCloud Example A: Simple two shard cluster) this

Re: suggestion howto handle highly repetitive valued field

2012-12-11 Thread Jie Sun
thank you David! -- View this message in context: http://lucene.472066.n3.nabble.com/suggestion-howto-handle-highly-repetitive-valued-field-tp4026104p4026163.html

Re: SolrCloud - Query performance degrades with multiple servers

2012-12-11 Thread Yonik Seeley
On Thu, Dec 6, 2012 at 8:08 PM, sausarkar sausar...@ebay.com wrote: Ok we think we found out the issue here. When solrcloud is started without specifying numShards argument solrcloud starts with a single shard but still thinks that there are multiple shards, so it forwards every single query to

Re: SolrCloud - Query performance degrades with multiple servers

2012-12-11 Thread Mark Miller
I'm still looking into this - didn't have a lot of luck seeing it with a test and am going to look at it manually. I'm hoping 4.1 by xmas! We will see though...need to get others on board. - Mark On Tue, Dec 11, 2012 at 2:40 PM, sausarkar sausar...@ebay.com wrote: Do you know when will 4.1 be

Re: modeling prices based on daterange using multipoints

2012-12-11 Thread britske
Hi David, Yeah, an interesting (as well as problematic, as far as implementing goes) use-case indeed :) 1. You mention there are no special caches / memory requirements inherent in this. For a given user query this would mean all hotels would have to search for all point.x each time, right? What would be a

Re: SolrCloud - Query performance degrades with multiple servers

2012-12-11 Thread Yonik Seeley
OK, I tried to reproduce it on trunk, and I can't (i.e. everything is looking fine). rm -rf example/solr/zoo_data cp -rp example example2 cp -rp example example3 cd example java -Dbootstrap_confdir=./solr/collection1/conf -Dcollection.configName=myconf -DzkRun -DnumShards=1 -jar start.jar cd

Re: Too many Tika errors

2012-12-11 Thread Mattmann, Chris A (388J)
Hi there -- you may want to post this to the d...@tika.apache.org list. Cheers, Chris On 12/11/12 11:08 AM, eShard zim...@yahoo.com wrote: I'm running Solr 4.0 on Tomcat 7.0.8 and I'm running the solr/example single core as well with manifoldcf v1.1 I had everything working but then the crawler

Rolling Deploys and SolrCloud

2012-12-11 Thread Mike Schultz
Does anybody have any experience with rolling deployments and SolrCloud? We have a production environment where we deploy new software and config simultaneously to individual servers in a rolling manner. At any point during the deployment, there may be N boxes with old software/config and M

Re: Rolling Deploys and SolrCloud

2012-12-11 Thread Mark Miller
Doing this with SolrCloud is not much different than doing it with old style Solr. ZooKeeper supports rolling restarts, and AFAIK, so does Solr generally. While the configs live in zk, they work the same way as if they were local. A SolrCore won't try and read them until you reload it. I think

Re: Highlighting data stored outside of Solr

2012-12-11 Thread Otis Gospodnetic
I don't recall anyone implementing this, but I know it's been discussed in the past, so check the ML archives. Otis -- SOLR Performance Monitoring - http://sematext.com/spm On Dec 11, 2012 2:49 PM, Michael Ryan mr...@moreover.com wrote: Has anyone ever attempted to highlight a field that is

Re: Update multiple documents

2012-12-11 Thread Otis Gospodnetic
But is that the best approach? If you use personIds in your second index then you don't have to do that. Maybe you are after joins in Solr? Otis On Dec 11, 2012 1:21 PM, Dikchant Sahi contacts...@gmail.com wrote: Hi, We have two set

Re: how to remove duplicate data while facet?

2012-12-11 Thread Otis Gospodnetic
Hi, Sounds like you don't need to reindex. You need to find duplicates and delete them. Otis On Dec 11, 2012 12:42 PM, 蒋明原 mailtojiangmingy...@gmail.com wrote: Thank you, first of all. Yes, having no duplicate unique keys would avoid this trouble. But

Re: Too many Tika errors

2012-12-11 Thread Jack Krupansky
What's going on here? What version of tika should I use? The version that comes with Solr/SolrCell. Try sending various document types directly to the Solr Extracting Request Handler and see if it might be related to your parameters or specific document types. Maybe the document isn't what

Re: Update multiple documents

2012-12-11 Thread Dikchant Sahi
My intention is to allow search on person names in the second index also. If we use personId in the second index, is there a way to achieve that? Yes, we are looking for join kind of feature. Thanks! On Wed, Dec 12, 2012 at 8:31 AM, Otis Gospodnetic otis.gospodne...@gmail.com wrote: But is

Re: modeling prices based on daterange using multipoints

2012-12-11 Thread David Smiley (@MITRE.org)
britske wrote Hi David, Yeah, an interesting (as well as problematic, as far as implementing goes) use-case indeed :) 1. You mention there are no special caches / memory requirements inherent in this. For a given user query this would mean all hotels would have to search for all point.x each time