Re: Use case for storing positions and offsets in index?

2013-05-09 Thread Jack Krupansky
Term positions in the index are used for phrase query and span queries. There is a separate concept called term vectors that maintains positions as well. It is most useful for highlighting - you want to know exactly where a term started and ended. -- Jack Krupansky -Original

Re: Use case for storing positions and offsets in index?

2013-05-09 Thread Jason Hellman
Consider further that term vector data and highlighting become very useful if you highlight externally to Solr. That is to say, you have the data stored externally and wish to re-parse positions of terms (especially synonyms) from source material. This is a (not too uncommon) technique used

Portability of Solr index

2013-05-09 Thread mukesh katariya
I have built a Solr index on Windows 7 Enterprise, 64-bit. I copied the index to CentOS release 6.2, 32-bit. The index is readable and the application is able to load data from the index on Linux. But there are a few fields on which fq queries don't work on Linux, while the same fq queries work on

Need solr query help

2013-05-09 Thread Abhishek tiwari
We are doing spatial search with the following logic: a) There are shops in a city, and each provides home delivery. b) Each shop has a different max_delivery_distance. Now my query is: suppose someone is searching from point P1 with radius R. The user wants the result of shops that can
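The post above is cut off, so the exact matching rule is not known. As a rough client-side sketch, assuming the goal is "shops inside the user's radius R whose own max_delivery_distance also reaches the user", the check could look like the following. The field names `lat`, `lon`, and `name` are hypothetical; only `max_delivery_distance` comes from the post.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def shops_that_deliver(shops, lat, lon, radius_km):
    """Names of shops within radius_km of the user that can also deliver that far.

    Assumed rule (not confirmed by the post): distance <= R and
    distance <= the shop's own max_delivery_distance.
    """
    result = []
    for shop in shops:
        d = haversine_km(lat, lon, shop["lat"], shop["lon"])
        if d <= radius_km and d <= shop["max_delivery_distance"]:
            result.append(shop["name"])
    return result
```

This only illustrates the per-shop condition. Doing the same filtering inside Solr would need a query that compares the computed distance with a per-document field, which is what the poster is asking about.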

More Like This and Caching

2013-05-09 Thread Giammarco Schisani
Hi all, Could anybody explain which Solr cache (e.g. queryResultCache, documentCache, fieldCache, etc.) can be used by the More Like This handler? One of my colleagues had previously suggested that the More Like This handler does not take advantage of any of the Solr caches. However, if I issue

Re: Re: Re: Re: Shard update error when using DIH

2013-05-09 Thread heaven
Thank you all, guys. Your advice works great and I don't see any errors in the Solr logs anymore. Best, Alex Monday 29 April 2013, you wrote: On 29 April 2013 14:55, heaven [hidden email][1] wrote: Got these errors after switching the field type to long: * *crm-test:*

filter result by numFound in Result Grouping

2013-05-09 Thread Shalom Ben-Zvi Kazaz
Hello list, In one of our searches that uses Result Grouping, we need to filter results to only groups that have more than one document in the group, or more specifically to groups that have two documents. Is this possible in some way? Thank you
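The thread index does not show whether Solr can do this filtering itself. One fallback is to filter on the client after the grouped response has been parsed. The sketch below assumes the standard grouped JSON layout (`grouped` → field name → `groups` → `doclist.numFound`); the function name is made up for illustration.

```python
def groups_with_count(response, field, wanted):
    """Return the groups whose doclist.numFound equals `wanted`.

    `response` is an already-parsed grouped Solr JSON response (a dict).
    """
    groups = response["grouped"][field]["groups"]
    return [g for g in groups if g["doclist"]["numFound"] == wanted]
```

Note that this only sees the groups on the page Solr returned, so page sizes and total counts will not line up with the filtered result.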

Re: Solr 4.3 fails in startup when dataimporthandler declaration is included in solrconfig.xml

2013-05-09 Thread Jan Høydahl
My question was: When you move DIH libs to Solr's classloader (e.g. instanceDir/lib and refer from solrconfig.xml), and remove solr.war from tomcat/lib, what error msg do you then get? Also make sure to delete the old tomcat/webapps/solr folder just to make sure you're starting from scratch

SolrCloud: IOException occurred when talking to server at

2013-05-09 Thread heaven
Hi, observing lots of these errors with SolrCloud Here is the instruction I am using to run services: zookeeper: 1: cd /opt/zookeeper/ 2: sudo bin/zkServer.sh start zoo1.cfg 3: sudo bin/zkServer.sh start zoo2.cfg 4: sudo bin/zkServer.sh start zoo3.cfg shards: 1: cd

Re: Solr 4.2 rollback not working

2013-05-09 Thread mark12345
So for all current versions of Solr, rollback will not work for SolrCloud? Will this change in the future, or will rollback always be unsupported for SolrCloud? This did catch me by surprise. Should the SolrJ documentation be updated to reflect this behavior?

Re: SolrCloud: IOException occurred when talking to server at

2013-05-09 Thread heaven
Forgot to mention: Solr is 4.2 and ZooKeeper 3.4.5. I do not do manual commits and prefer a softCommit each second and an autoCommit every 3 minutes. The problem happened again, with lots of errors in the logs and no description. Cluster state changed: on shard 2 a replica became the leader, the former leader got in

Re: SolrCloud: IOException occurred when talking to server at

2013-05-09 Thread heaven
Zookeeper log: 1 *2013-05-09 03:03:07,177* [myid:3] - WARN [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2183:Follower@118] - Got zxid 0x20001 expected 0x1 2 *2013-05-09 03:36:52,918* [myid:3] - ERROR [CommitProcessor:3:NIOServerCnxn@180] - Unexpected Exception: 3

Re: Solr 4.2 rollback not working

2013-05-09 Thread Mark Miller
At the least it should throw an exception if you try rollback with SolrCloud - though now there is discussion about removing it entirely. But yes, it's not supported and there are no real plans to support it. - Mark On May 9, 2013, at 7:21 AM, mark12345 marks1900-pos...@yahoo.com.au wrote:

Re: SolrCloud: IOException occurred when talking to server at

2013-05-09 Thread heaven
Can confirm this led to data loss. I have 1217427 records in the database and only 1217216 indexed, which means that Solr gave a successful response and then did not add some documents to the index. Seems like SolrCloud is not a production-ready solution; it would be good if there was a warning

Re: Solr 4.3 fails in startup when dataimporthandler declaration is included in solrconfig.xml

2013-05-09 Thread William Pierce
I got this to work (thanks, Jan, and all). It turns out that DIH jars need to be included explicitly by specifying in solrconfig.xml or placed in some default path under solr.home. I placed these jars in instanceDir/lib and it worked. Previously I had reported it as not working - this was

Re: Portability of Solr index

2013-05-09 Thread Alexandre Rafalovitch
What is the query/term you are looking for? I wonder if the difference is due to newline treatment on different platforms. Regards, Alex. Personal blog: http://blog.outerthoughts.com/ LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch - Time is the quality of nature that keeps events

Fuzzy searching documents over multiple fields using Solr

2013-05-09 Thread britske
Not sure if this has ever come up (or perhaps has even been implemented without me knowing), but I'm interested in doing fuzzy search over multiple fields using Solr. What I mean is the ability to return documents based on some 'distance calculation' without documents having to match 100% to the query.

Re: Fuzzy searching documents over multiple fields using Solr

2013-05-09 Thread Jack Krupansky
A simple OR boolean query will boost documents that have more matches. You can also selectively boost individual OR terms to control importance. And do an AND for the required terms, like tv. -- Jack Krupansky -Original Message- From: britske Sent: Thursday, May 09, 2013 11:21 AM
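A small sketch of the query string this advice leads to, assuming the default OR operator and standard Lucene syntax (`+term` for required, `term^boost` for a boost). The helper name and the example terms other than `tv` are illustrative only.

```python
def build_query(required, optional):
    """Build a Lucene-syntax query string.

    required: list of terms that must match (prefixed with '+').
    optional: dict of term -> boost; use None for an unboosted OR term.
    """
    parts = ["+" + term for term in required]
    for term, boost in optional.items():
        parts.append(term if boost is None else f"{term}^{boost}")
    return " ".join(parts)


# 'tv' is required; the other terms only raise the score when they match.
q = build_query(["tv"], {"samsung": 2, "led": None})
```

Documents that match more of the optional terms score higher, and the boosted term counts for more than the unboosted one.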

4.3 logging setup

2013-05-09 Thread richardg
On all prior versions I set up my logging via the logging.properties file in /usr/local/tomcat/conf; it looked like this: # Licensed to the Apache Software Foundation (ASF) under one or more # contributor license agreements. See the NOTICE file distributed with # this work for additional

Re: Fuzzy searching documents over multiple fields using Solr

2013-05-09 Thread Geert-Jan Brits
I didn't mention it but I'd like individual fields to contribute to the overall score on a continuum instead of 1 (match) and 0 (no match), which will lead to more fine-grained scoring. A contrived example: all other things equal a tv of 40 inch should score higher than a 38 inch tv when

Re: Use case for storing positions and offsets in index?

2013-05-09 Thread KnightRider
Thanks Jack Jason - Thanks -K'Rider -- View this message in context: http://lucene.472066.n3.nabble.com/Use-case-for-storing-positions-and-offsets-in-index-tp4061376p4061890.html Sent from the Solr - User mailing list archive at Nabble.com.

Grouping search results by field returning all search results for a given query

2013-05-09 Thread Luis Carlos Guerrero Covo
Hi, I'm using solr to maintain an index of items that belong to different companies. I want the search results to be returned in a way that is fair to all companies, thus I wish to group the results such that each company has 1 item in each group, and the groups of results should be returned

Re: Fuzzy searching documents over multiple fields using Solr

2013-05-09 Thread Jack Krupansky
You can use function queries to boost documents as well. Sorry, but it can get messy to figure out. See: http://wiki.apache.org/solr/FunctionQuery See also the edismax bf parameter: http://wiki.apache.org/solr/ExtendedDisMax#bf_.28Boost_Function.2C_additive.29 -- Jack Krupansky -Original
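As a hedged illustration of the `bf` suggestion, the request parameters might be assembled as below. `defType`, `qf`, and `bf` are edismax parameters; the field names `name`, `description`, and `screen_size_inches` are hypothetical and would have to exist in the schema.

```python
from urllib.parse import urlencode


def edismax_params(user_query, query_fields, boost_function):
    """Request parameters for an edismax query with an additive boost function."""
    return {
        "defType": "edismax",
        "q": user_query,
        "qf": " ".join(query_fields),
        "bf": boost_function,  # value of the function is added to the score
    }


params = edismax_params("tv", ["name", "description"], "screen_size_inches")
query_string = urlencode(params)
```

Because `bf` is additive, a larger field value contributes more to the score, which is the "continuum" effect asked for earlier in the thread. How strongly it should weigh against the text score needs tuning.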

Re: Grouping search results by field returning all search results for a given query

2013-05-09 Thread Jason Hellman
Luis, I am presuming you do not have an overarching grouping value here…and simply wish to show a standard search result that shows 1 item per company. You should be able to accomplish your second page of desired items (the second item from each of your 20 represented companies) by using the

RE: More Like This and Caching

2013-05-09 Thread David Parks
I'm not the expert here, but perhaps what you're noticing is actually the OS's disk cache. The actual solr index isn't cached by solr, but as you read the blocks off disk the OS disk cache probably did cache those blocks for you. On the 2nd run the index blocks were read out of memory. There was

Re: 4.3 logging setup

2013-05-09 Thread Jason Hellman
From: http://lucene.apache.org/solr/4_3_0/changes/Changes.html#4.3.0.upgrading_from_solr_4.2.0 Slf4j/logging jars are no longer included in the Solr webapp. All logging jars are now in example/lib/ext. Changing logging impls is now as easy as updating the jars in this folder with those

Re: More Like This and Caching

2013-05-09 Thread Jason Hellman
Purely from empirical observation, both the DocumentCache and QueryResultCache are being populated and reused in reloads of a simple MLT search. You can see in the cache inserts how much extra-curricular activity is happening to populate the MLT data by how many inserts and lookups occur on

Re: 4.3 logging setup

2013-05-09 Thread richardg
Thanks for responding. My issue is I've never changed anything w/ logging; I have always used the built-in JULI. I've never messed w/ any jar files, just had to edit the logging.properties file. I don't know where I would get the jars for JULI or where to put them, if that is what is needed. I

RE: Invalid version (expected 2, but 60) or the data in not in 'javabin' format

2013-05-09 Thread Sergiu Bivol
I have a similar problem. With 5 shards, querying 500K rows fails, but 400K is fine. Querying individual shards for 1.5 million rows works. All solr instances are v4.2.1 and running on separate Ubuntu VMs. It is not random, can be always reproduced by adding rows=50 to a query where numFound

Re: 4.3 logging setup

2013-05-09 Thread Jason Hellman
If you nab the jars in example/lib/ext and place them within the appropriate folder in Tomcat (and this will somewhat depend on which version of Tomcat you are using…let's presume tomcat/lib as a brute-force approach) you should be back in business. On May 9, 2013, at 11:41 AM, richardg

Re: 4.3 logging setup

2013-05-09 Thread Jan Høydahl
Hi, First of all, to set up logging using Log4J (which is really better than JULI), copy all the jars from Jetty's lib/ext over to Tomcat's lib folder; see the instructions here: http://wiki.apache.org/solr/SolrLogging#Solr_4.3_and_above. You can place your log4j.properties in tomcat/lib as well so

RE: Invalid version (expected 2, but 60) or the data in not in 'javabin' format

2013-05-09 Thread Sergiu Bivol
Adding the original message. Thank you Sergiu -Original Message- From: Sergiu Bivol [mailto:sbi...@blackberry.com] Sent: Thursday, May 09, 2013 2:50 PM To: solr-user@lucene.apache.org Subject: RE: Invalid version (expected 2, but 60) or the data in not in 'javabin' format I have a

Re: 4.3 logging setup

2013-05-09 Thread richardg
I had already copied those jars over and gotten the app to start (it wouldn't without them). I was able to configure slf4j/log4j logging using the log4j.properties in the /lib folder to start logging, but I don't want to switch. I have alerts set on the wording that the JULI logging puts out but

Re: 4.3 logging setup

2013-05-09 Thread Shawn Heisey
On 5/9/2013 12:54 PM, Jason Hellman wrote: If you nab the jars in example/lib/ext and place them within the appropriate folder in Tomcat (and this will somewhat depend on which version of Tomcat you are using…let's presume tomcat/lib as a brute-force approach) you should be back in business.

Re: Grouping search results by field returning all search results for a given query

2013-05-09 Thread Luis Carlos Guerrero Covo
Thank you for the prompt reply, Jason. The group.offset parameter is working for me; now I can iterate through all items for each company. The problem I'm having right now is pagination. Is there a way to implement this out of the box with Solr? Before, I was using group.main=true for

Re: Grouping search results by field returning all search results for a given query

2013-05-09 Thread Jason Hellman
I would think pagination is resolved by obtaining the numFound value for your returned groups. If you have numFound=6 then each page of 20 items (one item per company) would imply a total of 6 pages. You'll have to arbitrate for the variance here…but it would seem to me you need as many pages
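A sketch of the paging scheme discussed in this thread: each page shows one item per company, selected with `group.limit=1` and an increasing `group.offset`. The reply is truncated, so taking the page count from the largest group's `numFound` is an assumption; function names are illustrative.

```python
from urllib.parse import urlencode


def page_params(query, group_field, page, groups_per_page=20):
    """Parameters for one page: the page-th document of each group (0-based)."""
    return {
        "q": query,
        "group": "true",
        "group.field": group_field,
        "group.limit": 1,        # one item per company per page
        "group.offset": page,    # which item inside each group
        "rows": groups_per_page, # how many groups (companies) to return
        "wt": "json",
    }


def total_pages(groups):
    """Assumed rule: as many pages as the largest group has documents."""
    return max((g["doclist"]["numFound"] for g in groups), default=0)
```

Later pages get shorter as smaller groups run out of items, which is the variance the reply mentions.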

Re: 4.3 logging setup

2013-05-09 Thread richardg
These are the files I have in my /lib folder: slf4j-api-1.6.6 log4j-1.2.16 jul-to-slf4j-1.6.6 jcl-over-slf4j-1.6.6 slf4j-jdk14-1.6.6 log4j-over-slf4j-1.6.6 Currently everything seems to be logging like before. After I followed the instructions in Jan's post replacing slf4j-log4j12-1.6.6.jar

Does Distributed Search are Cached Only the By Node That Runs Query?

2013-05-09 Thread Furkan KAMACI
I have Solr 4.2.1 running as SolrCloud. When I do a search on SolrCloud like this: ip_of_node_1:8983/solr/select?q=*:*&rows=1 and then check the admin page, I see that I have 5 GB Java heap: 616.32 MB is dark gray, 3.13 GB is gray. Before my search it was something like: 150 MB dark

Re: 4.3 logging setup

2013-05-09 Thread Shawn Heisey
On 5/9/2013 1:41 PM, richardg wrote: These are the files I have in my /lib folder: slf4j-api-1.6.6 log4j-1.2.16 jul-to-slf4j-1.6.6 jcl-over-slf4j-1.6.6 slf4j-jdk14-1.6.6 log4j-over-slf4j-1.6.6 Currently everything seems to be logging like before. After I followed the instructions in Jan's

Is the CoreAdmin RENAME method atomic?

2013-05-09 Thread Lan
We need to implement a locking mechanism for a full-reindexing Solr server pool. We could use a database or ZooKeeper as our locking mechanism, but that's a lot of work. Could Solr do it? I noticed the core admin RENAME function (http://wiki.apache.org/solr/CoreAdmin#RENAME). Is this an synchronous

Re: Frequent OOM - (Unknown source in logs).

2013-05-09 Thread shreejay
We ended up using a Solr 4.0 (now 4.2) without the cloud option. And it seems to be holding good. -- View this message in context: http://lucene.472066.n3.nabble.com/Frequent-OOM-Unknown-source-in-logs-tp4029361p4061945.html Sent from the Solr - User mailing list archive at Nabble.com.

SolrCloud Sorting Results By Relevance

2013-05-09 Thread Furkan KAMACI
When I make a search at Solr 4.2.1 that runs as SolrCloud I get: <result name="response" numFound="18720" start="0" maxScore="1.2672108"> First one has that boost: <float name="boost">1.3693064</float> Second one has that: <float name="boost">1.7501166</float> and third one: <float name="boost">1.0387472</float>

Apache Whirr for SolrCloud with external Zookeeper

2013-05-09 Thread Furkan KAMACI
Hi folks, I have tested Solr 4.2.1 as SolrCloud and I plan to use 4.3.1 in my pre-production environment when it is ready. I would like to know whether anybody uses Apache Whirr for SolrCloud with an external ZooKeeper ensemble. What are folks using for this kind of purpose?

Status of EDisMax

2013-05-09 Thread André Widhani
Hi, what is the current status of the Extended DisMax Query Parser? The release notes for Solr 3.1 say it was experimental at that time (two years back). The current wiki page for EDisMax does not contain any such statement. We recently ran into the issue described in SOLR-2649 (using

Negative Boosting at Recent Versions of Solr?

2013-05-09 Thread Furkan KAMACI
I know that whilst Lucene allows negative boosts, Solr does not. However, did this change with newer versions of Solr (I use Solr 4.2.1), or is it still the same?

Re: Apache Whirr for SolrCloud with external Zookeeper

2013-05-09 Thread Otis Gospodnetic
I've never encountered anyone using Whirr to launch Solr even though that's possible - http://issues.apache.org/jira/browse/WHIRR-465 Otis -- Solr ElasticSearch Support http://sematext.com/ On Thu, May 9, 2013 at 5:28 PM, Furkan KAMACI furkankam...@gmail.com wrote: Hi Folks; I have

Re: Apache Whirr for SolrCloud with external Zookeeper

2013-05-09 Thread Furkan KAMACI
I saw that ticket and wanted to ask the mailing list about it. I want to give it a try and report back to the list. What do folks use for this kind of purpose? 2013/5/10 Otis Gospodnetic otis.gospodne...@gmail.com I've never encountered anyone using Whirr to launch Solr even though that's possible -

Re: Negative Boosting at Recent Versions of Solr?

2013-05-09 Thread Jack Krupansky
Solr does support both additive and multiplicative boosts. Although Solr doesn't support negative multiplicative boosts on query terms, it does support fractional multiplicative boosts (0.25) which do allow you to de-boost a term. The boosts for individual query terms and for the edismax qf
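To make the fractional-boost idea concrete, here is a minimal sketch that appends a de-boosted optional term to a query string. The 0.25 factor comes from the reply; the helper name and the example terms are hypothetical.

```python
def deboost(main_query, term, factor=0.25):
    """Append `term` as an optional clause with a fractional boost (0 < factor < 1)."""
    if not 0 < factor < 1:
        raise ValueError("factor must be between 0 and 1 to de-boost")
    return f"{main_query} {term}^{factor}"


q = deboost("tv", "refurbished")
```

A fractional boost makes the term count for less than an unboosted term would; it does not subtract from the score of documents that contain it.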

Re: Index compatibility between Solr releases.

2013-05-09 Thread Erick Erickson
Solr strives to stay backwards-compatible for 1 major revision, so 4.x should be able to work with 3.x indexes. One caution though, well actually two. 1 If you have a master/slave setup, upgrade the _slaves_ first. If you upgrade a master first and it merges segments, then the slaves won't be able to

Re: Index corrupted detection from http get command.

2013-05-09 Thread Erick Erickson
There's no way to do this that I know of. There's the checkindex tool, but it's fairly expensive resource-wise and there's no HTTP command to do it. Best Erick On Tue, May 7, 2013 at 8:04 PM, Michel Dion diom...@gmail.com wrote: Hello, I'm looking for a way to detect solr index corruption

Re: transientCacheSize doesn't seem to have any effect, except on startup

2013-05-09 Thread Erick Erickson
I'm slammed with stuff and have to leave for vacation Saturday morning so I'll be going silent for a while, sorry Best Erick On Wed, May 8, 2013 at 11:27 AM, didier deshommes dfdes...@gmail.com wrote: Any idea on this? I still cannot get the combination of transient cores and

Re: Apache Whirr for SolrCloud with external Zookeeper

2013-05-09 Thread Otis Gospodnetic
Great, let us know how it works for you. Blog post? Otis Solr ElasticSearch Support http://sematext.com/ On May 9, 2013 6:30 PM, Furkan KAMACI furkankam...@gmail.com wrote: I saw that ticket and wanted to ask it to mail list. I want to give it a try and feedback to mail list. What folks use

Re: SolrCloud: IOException occurred when talking to server at

2013-05-09 Thread Shawn Heisey
On 5/9/2013 7:31 AM, heaven wrote: Can confirm this led to data loss. I have 1217427 records in the database and only 1217216 indexed, which means that Solr gave a successful response and then did not add some documents to the index. Seems like SolrCloud is not a production-ready

Re: SolrCloud Sorting Results By Relevance

2013-05-09 Thread Otis Gospodnetic
Hits are sorted by relevance score by default. You are listing boost. Otis Solr ElasticSearch Support http://sematext.com/ On May 9, 2013 5:16 PM, Furkan KAMACI furkankam...@gmail.com wrote: When I make a search at Solr 4.2.1 that runs as SolrCloud I get: result name=response numFound=18720

Re: Status of EDisMax

2013-05-09 Thread Otis Gospodnetic
Didn't check that issue, but edismax is not experimental any more - most solr users use it. Otis Solr ElasticSearch Support http://sematext.com/ On May 9, 2013 5:36 PM, André Widhani andre.widh...@digicol.de wrote: Hi, what is the current status of the Extended DisMax Query Parser? The

Re: Does Distributed Search are Cached Only the By Node That Runs Query?

2013-05-09 Thread Otis Gospodnetic
You are looking at jvm heap but attributing it to caching only. Not quite right...there are other things in that jvm heap. Otis Solr ElasticSearch Support http://sematext.com/ On May 9, 2013 3:55 PM, Furkan KAMACI furkankam...@gmail.com wrote: I have Solr 4.2.1 and run them as SolrCloud. When

Re: More Like This and Caching

2013-05-09 Thread Otis Gospodnetic
This is correct, doc cache for previously read docs regardless of which query read them and query cache for repeat query. Plus OS cache for actual index files. Otis Solr ElasticSearch Support http://sematext.com/ On May 9, 2013 2:32 PM, Jason Hellman jhell...@innoventsolutions.com wrote:

Re: Per Shard Replication Factor

2013-05-09 Thread Otis Gospodnetic
Could these just be different collections? Then sharding and replication are independent. And you can reduce replication factor as the index ages. Otis Solr ElasticSearch Support http://sematext.com/ On May 9, 2013 1:43 AM, Steven Bower smb-apa...@alcyon.net wrote: Is it currently possible to

Re: 4.3 logging setup

2013-05-09 Thread Jan Høydahl
I've updated the WIKI: http://wiki.apache.org/solr/SolrLogging#Switching_from_Log4J_logging_back_to_Java-util_logging -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com 9. mai 2013 kl. 21:57 skrev Shawn Heisey s...@elyograg.org: On 5/9/2013 1:41 PM, richardg wrote:

Re: SOLR Error: Document is missing mandatory uniqueKey field

2013-05-09 Thread zaheer.java
Here is the stack trace: DEBUG - 2013-05-09 18:53:06.411; org.apache.solr.update.processor.LogUpdateProcessor; PRE_UPDATE add{,id=(null)} {wt=javabin&version=2} DEBUG - 2013-05-09 18:53:06.411; org.apache.solr.update.processor.LogUpdateProcessor; PRE_UPDATE FINISH {wt=javabin&version=2} INFO -

SOLR Error: Document is missing mandatory uniqueKey field

2013-05-09 Thread zaheer.java
I repeatedly get this error while adding documents to Solr using SolrJ: Document is missing mandatory uniqueKey field: orderItemKey. This field is defined as the uniqueKey in the document schema. I've made sure that I'm passing this field from Java by logging it upfront. As suggested somewhere, I've

Re: dataimport handler

2013-05-09 Thread William Bell
It does not work anymore in 4.x. ${dih.last_index_time} does work, but the entity version does not. Bill On Tue, May 7, 2013 at 4:19 PM, Shalin Shekhar Mangar shalinman...@gmail.com wrote: Using ${dih.entity_name.last_index_time} should work. Make sure you put it in quotes in your query.

Re: SOLR Error: Document is missing mandatory uniqueKey field

2013-05-09 Thread Shawn Heisey
On 5/9/2013 7:44 PM, zaheer.java wrote: I repeatedly get this error while adding documents to SOLR using SOLRJ Document is missing mandatory uniqueKey field: orderItemKey. This field is defined as uniqueKey in the Document Schema. I've made sure that I'm passing this field from Java by

Re: SOLR guidance required

2013-05-09 Thread Shawn Heisey
On 5/9/2013 9:41 PM, Kamal Palei wrote: I hope there is some mechanism by which I can associate salary, experience, age, etc. with a resume document during indexing. And when I search for resumes I can give all filters accordingly, retrieve 100 records, and straight away show 100

RE: Is the CoreAdmin RENAME method atomic?

2013-05-09 Thread David Parks
Find the discussion titled "Indexing off the production servers" from just a week ago in this same forum; there is a significant discussion of this feature that you will probably want to review. -Original Message- From: Lan [mailto:dung@gmail.com] Sent: Friday, May 10, 2013 3:42 AM To: