Re: Hotel Searches

2013-01-09 Thread Upayavira
It seems to me like you want to use result grouping by hotel. You'll have to add up the tariffs for each hotel, but that isn't hard. Upayavira On Wed, Jan 9, 2013, at 06:08 AM, Harshvardhan Ojha wrote: Hi Alex, Thanks for your reply. I saw prices based on daterange using multipoints . But

Re: Hotel Searches

2013-01-09 Thread Uwe Reh
Hi, maybe I'm thinking too simply again. Nevertheless, here is an idea to solve the problem. The basic thought is to get rid of the range query. Have: - a text field 'vacant_days' holding, instead of ISO dates, just simple dates in the form mmdd - a dynamic field 'price_*'. You can add the tariff for
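The preview above is cut off, so the following is only a rough Python sketch of how the idea might look from the client side. It assumes one 'price_mmdd' field per vacant day (the exact naming is a guess based on the 'price_*' dynamic field), and the helper names stay_days, availability_query and total_tariff are made up for illustration.

```python
from datetime import date, timedelta


def stay_days(check_in, nights):
    """Return the mmdd token for every night of the stay."""
    return [(check_in + timedelta(days=i)).strftime("%m%d") for i in range(nights)]


def availability_query(check_in, nights):
    """Build a query requiring every night of the stay in 'vacant_days'.

    No range query is needed: each night is a plain term.
    """
    return "vacant_days:(" + " AND ".join(stay_days(check_in, nights)) + ")"


def total_tariff(doc, check_in, nights):
    """Add up the per-day tariffs of one hotel document for the stay.

    Assumes the document carries one 'price_mmdd' value per vacant day.
    """
    return sum(doc[f"price_{day}"] for day in stay_days(check_in, nights))


# Hypothetical hotel document, as it might come back from a search.
hotel = {
    "vacant_days": "0109 0110 0111",
    "price_0109": 100,
    "price_0110": 120,
    "price_0111": 90,
}

print(availability_query(date(2013, 1, 9), 2))
print(total_tariff(hotel, date(2013, 1, 9), 2))
```

The summing step corresponds to the "add up the tariffs for each hotel" remark earlier in the thread; whether it happens in the client or elsewhere is left open here.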

Solr + Munin, a good plugin?

2013-01-09 Thread Bruno Mannina
Dear Solr Users, Does anyone have a plugin to track the number of requests (/select) per hour/day/week/month/year? I tried the solr_qps plugin, but it's not really good. Thanks a lot, Bruno

[OFFER] Consulting job with search specialists based in Cambridge UK

2013-01-09 Thread Charlie Hull
Hi all, Hope you don't mind me cluttering up the list with a job offer. We're a team of search specialists based in the UK and we're hiring: http://www.flax.co.uk/hiring/ We're ideally looking for someone with experience of Apache Lucene/Solr development, able to work on a flexible contract

performance improvements on ip look up query

2013-01-09 Thread Lee Carroll
Hi, We are doing a lat/lon lookup query using IP address. We have a 6.5 million document core with the following structure: start ip block, end ip block, location id, location_lat_lon. The field defs are: <types> <fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/> <fieldType

Re: fieldtype for name

2013-01-09 Thread Michael Jones
Thanks. The need isn't to match 'dick' to 'robert' but to search for: 'name surname', 'name, surname', 'surname name', 'surname, name'. And nothing else; I don't need to worry about nicknames or abbreviations of a name, just the above variations. I think I might use text_ws. On Tue,

Re: fieldtype for name

2013-01-09 Thread Michael Jones
Also, I'm allowing users to enter a name in quotes to search for an exact name. So at the moment only smith, robert will return any results, whereas *robert smith* will return all variations, including 'smith, herbert'. On Wed, Jan 9, 2013 at 11:09 AM, Michael Jones michaelj...@gmail.com wrote:

Highlighting: When alternateField does not exist

2013-01-09 Thread Jan Høydahl
Hi, The alternateField and maxAlternateFieldLength params work well, but only as long as the alternate field actually exists for the document. If it does not, highlighting returns nothing. We would like this behavior 1. Highlighting in body if matches 2. Fallback to verbatim teaser if it

RE: Hotel Searches

2013-01-09 Thread Harshvardhan Ojha
Hi Uwe, Thanks for your reply. I think this will solve my problem. Regards Harshvardhan Ojha -Original Message- From: Uwe Reh [mailto:r...@hebis.uni-frankfurt.de] Sent: Wednesday, January 09, 2013 2:52 PM To: solr-user@lucene.apache.org Subject: Re: Hotel Searches Hi, maybe I'm

Re: fieldtype for name

2013-01-09 Thread Otis Gospodnetic
Hi, Without seeing the configs I would guess default query operator might be OR (and check docs for mm parameter on the Wiki) or there are ngrams involved. Former is more likely. Otis Solr ElasticSearch Support http://sematext.com/ On Jan 9, 2013 6:16 AM, Michael Jones michaelj...@gmail.com

Re: wildcard faceting in solr cloud

2013-01-09 Thread jmozah
I am testing it.. and i will upload it after that.. ./Zahoor HBase Musings On 09-Jan-2013, at 2:55 AM, Upayavira u...@odoko.co.uk wrote: Have you uploaded a patch to JIRA??? Upayavira On Tue, Jan 8, 2013, at 07:57 PM, jmozah wrote: Hmm. Fixed it. Did similar thing as SOLR-247 for

RE: Highlighting: When alternateField does not exist

2013-01-09 Thread Markus Jelsma
Hi , That should be fairly easy to make in alternateField() in DefaultSolrHighlighter. We made a small change there to support globs in alternateField. Cheers, -Original message- From:Jan Høydahl jan@cominvent.com Sent: Wed 09-Jan-2013 12:44 To: solr-user@lucene.apache.org

Re: fieldtype for name

2013-01-09 Thread Michael Jones
Hi, My schema file is here http://pastebin.com/ArY7xVUJ Query (name:'ian paisley') returns ~ 3000 results Query (name:'paisley, ian') returns ~ 250 results - That is how the name is stored, so is returning just the results with that person. I need all variations to return 250 results Query

Re: fieldtype for name

2013-01-09 Thread Upayavira
Try q=name:(ian paisley)&q.op=AND. Does that work better for you? It would also match Ian James Paisley, but not Ian Jackson. Upayavira On Wed, Jan 9, 2013, at 01:30 PM, Michael Jones wrote: Hi, My schema file is here http://pastebin.com/ArY7xVUJ Query (name:'ian paisley') returns ~ 3000
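The suggestion above boils down to sending two request parameters, q and q.op. A minimal Python sketch of building the URL-encoded query string follows. It uses only the standard library, and the helper name build_select_query is made up for illustration, not part of any Solr client.

```python
from urllib.parse import urlencode


def build_select_query(field, terms, operator="AND"):
    """Return URL-encoded parameters for a field:(term term ...) query.

    With q.op=AND every term must match, so extra middle names are
    tolerated but a different surname is not.
    """
    params = {
        "q": f"{field}:({' '.join(terms)})",
        "q.op": operator,
    }
    return urlencode(params)


query_string = build_select_query("name", ["ian", "paisley"])
print(query_string)
```

The result would be appended to the select handler URL of the core in question; the host and path are deployment specific and deliberately left out.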

Restore hot backup

2013-01-09 Thread marotosg
Hi, Is possible to restore an old backup without shutting down Solr? Regards, Sergio -- View this message in context: http://lucene.472066.n3.nabble.com/Restore-hot-backup-tp4031866.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: fieldtype for name

2013-01-09 Thread Michael Jones
Brilliant! Thank you! On Wed, Jan 9, 2013 at 1:37 PM, Upayavira u...@odoko.co.uk wrote: q=name:(ian paisley)&q.op=AND

Performance issue with group.ngroups=true

2013-01-09 Thread Mickael Magniez
Hi, I have a performance issue with the group.ngroups=true parameter. I have an index with 100k documents (small documents, 1-10 documents per group, grouped on a string field). If I run q=*:*...group.ngroups=true, I get a 4s response time vs 50ms without the ngroups parameter. Is there a workaround for

CoreAdmin STATUS performance

2013-01-09 Thread Shahar Davidson
Hi All, I have a client app that uses SolrJ and needs to collect the names (and just the names) of all loaded cores. I have about 380 Solr Cores on a single Solr server (net indices size is about 220GB). Running the STATUS action takes about 800ms - that seems a bit too long, given

massive memory consumption of grouping feature

2013-01-09 Thread clawu01
Hello, we are upgrading solr from 1.3 to 4.0. In solr 1.3 we used the SOLR-236 patch to realize grouping/ field collapsing. We did not have a memory issue with the field collapsing feature in our 1.3 version. However, we do now. The query looks something like this:

Re: Clean Up Aged Index Using DeletionPolicy

2013-01-09 Thread hyrax
Hey Shawn, Thanks a lot for your detailed explanation of deletionPolicy. Although it's frustrating that Solr doesn't support the function I need, I'm really glad that you pointed it out so that I can move on. What I'm thinking now is adding a new field for the time a document is indexed, so a simple

SolrCloud - shard distribution

2013-01-09 Thread James Thomas
Hi, Simple question, I hope. Using the nightly build of 4.1 from yesterday (Jan 8, 2013), I started 6 Solr nodes. I issued the following command to create a collection with 3 shards, and a replication factor=2. So a total of 6 shards. curl

Re: performance improvements on ip look up query

2013-01-09 Thread Lee Carroll
Hi Otis, The cache was a modest 4096 with a hit rate of 0.23 after a 24hr period. We doubled it and the hit rate went to 0.25. Our interpretation is that IP is pretty much a cache-busting value (?) and that cache size is not at play here. The q param is just startIpNum:[* TO 180891652] AND endIpNum:[180891652
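For readers following along, here is a small Python sketch of how a dotted IPv4 address maps to the integer used in that query. The second range clause is truncated in the preview above, so the closing "TO *]" below is an assumption, as are the helper names ip_to_int and ip_block_query.

```python
import ipaddress


def ip_to_int(dotted):
    """Convert a dotted IPv4 address to its 32-bit integer value."""
    return int(ipaddress.IPv4Address(dotted))


def ip_block_query(dotted, start_field="startIpNum", end_field="endIpNum"):
    """Build a range query matching the block that contains the address.

    ASSUMPTION: the end clause is the open range [n TO *]; the original
    message is cut off before that part.
    """
    n = ip_to_int(dotted)
    return f"{start_field}:[* TO {n}] AND {end_field}:[{n} TO *]"


print(ip_to_int("10.200.48.4"))
print(ip_block_query("10.200.48.4"))
```

Because nearly every request carries a different integer, each query string is unique, which fits the low cache hit rate reported above.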

Re: Restore hot backup

2013-01-09 Thread Upayavira
If you are in multicore mode, you can stop a core, move the backed up files into place, and restart/recreate the core. That would have the effect you desire. You may well be able to get away with swapping out the files and reloading the core, but the above would be safer. Best make sure you're

Re: SolrCloud - shard distribution

2013-01-09 Thread Mark Miller
I just tried this. I started 6 nodes with collection1 spread across two shards. Looked at the admin-cloud-graph view and everything looked right and green. Next, I copied and pasted your command and refreshed the graph cloud view. I see a new collection called consumer1 - all of its nodes are

Re: DIH fails after processing roughly 10million records

2013-01-09 Thread Shawn Heisey
On 1/8/2013 11:19 PM, vijeshnair wrote: Yes Shawn, the batchSize is -1 only and I also have the mergeScheduler exactly same as you mentioned. When I had this problem in SOLR 3.4, I did an extensive googling and gathered much of the tweaks and tuning from different blogs and forums and

Re: Performance issue with group.ngroups=true

2013-01-09 Thread Jack Krupansky
group.ngroups=true is always going to be somewhat expensive, but in your case it seems more expensive than I would expect. You should check to see that you have enough Java JVM heap to hold more of the index and to avoid any excessive GCs. -- Jack Krupansky -Original Message- From:

RE: SolrJ DirectXmlRequest

2013-01-09 Thread Ryan Josal
I also don't know what's creating them. Maybe Solr, but also maybe Tomcat, maybe apache commons. I could change java.io.tmpdir to one with more space, but the problem is that many of the temp files end up permanent, so eventually it would still run out of space. I also considered setting the

Re: DIH fails after processing roughly 10million records

2013-01-09 Thread Shawn Heisey
On 1/9/2013 9:41 AM, Shawn Heisey wrote: With maxThreadCount at 1 and maxMergeCount at 6, I was able to complete full-import with no problems. All mysql (5.1.61) server-side timeouts are at their defaults - they don't show up in my.cnf and I haven't tweaked them anywhere else either. A full

Re: Is there faceting with Solr 4 spatial?

2013-01-09 Thread Smiley, David W.
Erick, Alex asked about Solr 4 spatial, and his use-case requires it because he's got multi-value spatial fields (multiple business office locations per document). So the Solr 3 spatial solution you posted won't cut it. Alex, You can do this in Solr 4.0. Use one facet.query per circle (I.e.

Re: Convert Complex Lucene Query to SolrQuery

2013-01-09 Thread Jagdish Nomula
Thanks Otis and Jack for your responses. We are trying to use EmbeddedSolrServer with a Solr query as follows: EmbeddedSolrServer server = new EmbeddedSolrServer(coreContainer, ""); SolrQuery solrQuery = new SolrQuery(luceneQuery.toString()); // Here luceneQuery is a dismax query with

newbie questions about cache stats query perf

2013-01-09 Thread AJ Weber
Sorry, I did search for an answer, but didn't find an applicable one. I'm currently stuck on 1.4.1 (running in Tomcat 6 on 64bit Linux) for the time being... When I see stats like this: name: documentCache class: org.apache.solr.search.LRUCache version: 1.0 description:

RE: SolrCloud - shard distribution

2013-01-09 Thread James Thomas
Thanks for the quick reply Mark. I tried all kinds of variations, but I could not get all 6 nodes to participate. So I downloaded the source code and took a look at OverseerCollectionProcessor.java. I think my result is as-coded. Line 251 has this loop: for (int i = 1; i <= numSlices; i++) {

Re: SOLR '0' Status: Communication Error

2013-01-09 Thread ddineshkumar
I forgot to mention: when I add documents to SOLR, I add them in batches of 50. Because my table has a lot of records, I have to do it in batches due to memory constraints. The 'Communication error' occurs only for some batches. For other batches, documents get added properly. And also, I am including

Re: SOLR '0' Status: Communication Error

2013-01-09 Thread Shawn Heisey
On 1/9/2013 11:48 AM, ddineshkumar wrote: I forgot to mention: when I add documents to SOLR, I add them in batches of 50. Because my table has a lot of records, I have to do it in batches due to memory constraints. The 'Communication error' occurs only for some batches. For other batches, documents

RE: SolrCloud - shard distribution

2013-01-09 Thread James Thomas
Oops, small copy-paste error. Had my i's and j's backwards. Should be: --- slice1, rep2 (i=1,j=2) == chooses node[1] --- slice2, rep1 (i=2,j=1) == chooses node[1] -Original Message- From: James Thomas [mailto:jtho...@camstar.com] Sent: Wednesday, January 09, 2013 1:39 PM To:

Re: defaultOperator in schema.xml

2013-01-09 Thread Rafał Kuć
Hello! You should set the q.op parameter in your request handler configuration in solrconfig.xml instead of using the default operator from schema.xml. -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch I'm testing out Solr 4.0. I got the sample

Re: SolrJ DirectXmlRequest

2013-01-09 Thread Otis Gospodnetic
Hi Ryan, One typically uses a Solr client library to talk to Solr instead of sending raw XML. For example, if your application is written in Java then you would use SolrJ. Otis -- Solr ElasticSearch Support http://sematext.com/ On Wed, Jan 9, 2013 at 12:03 PM, Ryan Josal rjo...@rim.com

Re: Convert Complex Lucene Query to SolrQuery

2013-01-09 Thread Otis Gospodnetic
Aha. I think the problem here is the assumption that .toString() on a Lucene query will give you a string that can then be re-parsed into the proper query, and that is currently not the case. But if you start with the raw query like the one you would use with the Lucene QP, you should be fine. Can

Re: newbie questions about cache stats query perf

2013-01-09 Thread Otis Gospodnetic
Hi, In your Solr version there is a notion of Searcher being opened and reopened. Every time that happens those non-cumulative stats reset. The cumulative_ stats just don't refresh, so you have numbers from when the whole Solr started, not just from the last time Searcher opened. Your cache is

Re: unittest fail (sometimes) for float field search

2013-01-09 Thread Roman Chyla
Hi, It is neither Eclipse related nor codec related. There were two issues. I had a wrong configuration of NumericConfig: new NumericConfig(4, NumberFormat.getNumberInstance(), NumericType.FLOAT)) I changed that to: new NumericConfig(4, NumberFormat.getNumberInstance(Locale.US),

Re: Pause and resume indexing on SolR 4 for backups

2013-01-09 Thread Paul Jungwirth
Are you sure a commit didn't happen between? Also, a background merge might have happened. As to using a backup, you are right, just stop solr, put the snapshot into index/data, and restart. This was mentioned before but seems not to have gotten any attention: can't you use the

Re: Pause and resume indexing on SolR 4 for backups

2013-01-09 Thread Otis Gospodnetic
Hi Paul, Hot backup is OK. There was a thread on this topic yesterday and the day before. But you should always try running from backup regardless of what anyone says here, because if you have to do that one day you want to know you verified it :) Otis -- Solr ElasticSearch Support

Re: Clean Up Aged Index Using DeletionPolicy

2013-01-09 Thread Otis Gospodnetic
Just to satisfy my curiosity - are you looking to have TTL for documents or for indices? The former: https://issues.apache.org/jira/browse/SOLR-3874 The latter: no issue that I know of, typically managed by the application. Otis -- Solr ElasticSearch Support http://sematext.com/ On Wed,

Re: SOLR '0' Status: Communication Error

2013-01-09 Thread ddineshkumar
Thanks Shawn. I tried increasing the following timeouts in PHP: max_execution_time, max_input_time, default_socket_timeout. But I still get 'Communication error'. Please let me know if I have to change any other timeout in PHP. -- View this message in context:

Re: Clean Up Aged Index Using DeletionPolicy

2013-01-09 Thread hyrax
Exactly what I want. For a simple scenario: Index a batch of documents 20 days ago and they are searchable via Solr. After say 20 days, you can't search them anymore because they are deleted automatically by Solr. Thanks, Hao -- View this message in context:

Re: Clean Up Aged Index Using DeletionPolicy

2013-01-09 Thread Otis Gospodnetic
Options: 1. Run delete by query every N hours/days to purge old docs 2. Create daily indices and drop them every H hours/days to get rid of all old docs The TTL support for 1. would probably be implemented with delete by query. The drawback of 1. compared to 2. is that you will pay the price

Re: Clean Up Aged Index Using DeletionPolicy

2013-01-09 Thread Walter Underwood
Solr does not delete anything automatically. Add a timestamp field when you index. Use delete by query to delete everything older than 20 days. wunder On Jan 9, 2013, at 12:44 PM, hyrax wrote: Exactly what I want. For a simple scenario: Index a batch of documents 20 days ago and they are
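As a rough illustration of the delete-by-query approach described above, the Python sketch below builds the XML update message. The field name "timestamp" and the helper name delete_older_than are assumptions, so substitute whatever date field your schema actually defines. Sending the message to the update handler (and committing) is left out.

```python
from xml.sax.saxutils import escape


def delete_older_than(days, field="timestamp"):
    """Build a delete-by-query message for documents older than `days` days.

    Uses Solr date math (NOW-<n>DAYS) in an open-ended range query.
    ASSUMPTION: `field` is a date field populated at index time.
    """
    query = f"{field}:[* TO NOW-{days}DAYS]"
    return f"<delete><query>{escape(query)}</query></delete>"


print(delete_older_than(20))
```

Running something like this from a scheduled job every N hours or days matches option 1 in the earlier reply in this thread.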

Re: Pause and resume indexing on SolR 4 for backups

2013-01-09 Thread Paul Jungwirth
Yes, I agree about making sure the backups actually work, whatever the approach. Thanks for your reply and all you've contributed to the Solr/Lucene community. The Lucene in Action book has been a huge help to me. Paul On Wed, Jan 9, 2013 at 12:16 PM, Otis Gospodnetic

performing a boolean query (OR) with a large number of terms

2013-01-09 Thread geeky2
hello, environment: solr 3.5 i have a requirement to perform a boolean query (like the example below) with a large number of terms. the number of terms could be 15 or possibly larger. after looking over several threads and the smiley book - i think i just have to include the parens and string all

RE: SolrJ DirectXmlRequest

2013-01-09 Thread Ryan Josal
Thanks Otis, DirectXmlRequest is part of the SolrJ library, so I guess that means it is not commonly used. My use case is that I'm applying an XSLT to the raw XML on the client side, instead of leaving that up to the Solr master (although even if I applied the XSLT on the Solr server, I'd

Re: Pause and resume indexing on SolR 4 for backups

2013-01-09 Thread Upayavira
The point was as much about how to use a backup, as to how to make one in the first place. the replication handler can handle spitting out a backup, but there's no straightforward way to tell Solr to switch to another set of index files instead. You'd have to do clever stuff with the

SOLR/Velocity Test Cases

2013-01-09 Thread Marcos Mendez
Hi, I'm trying to write some tests based on SolrTestCaseJ4 that test using velocity in SOLR. I found VelocityResponseWriterTest.java, but this does not test that. In fact it has a todo to do what I want to do. Anyone have an example out there? I just need to check if velocity is loaded with

Re: How to run many MoreLikeThis request efficiently?

2013-01-09 Thread Yandong Yao
Any comments on this? Thanks very much in advance! 2013/1/9 Yandong Yao yydz...@gmail.com Hi Solr Guru, I have two sets of documents in one SolrCore; each set has about 1M documents with a different document type, say 'type1' and 'type2'. Many documents in the first set are very similar to 1 or

what is difference between 4.1 and 5.x

2013-01-09 Thread solr-user
just curious as to what the difference is between 4.1 and 5.0 i.e. is 4.1 a maintenance branch for what is currently 4.0 or are they very different designs/architectures -- View this message in context: http://lucene.472066.n3.nabble.com/what-is-difference-between-4-1-and-5-x-tp4032064.html

Re: CoreAdmin STATUS performance

2013-01-09 Thread Yury Kats
On 1/9/2013 10:38 AM, Shahar Davidson wrote: Hi All, I have a client app that uses SolrJ and needs to collect the names (and just the names) of all loaded cores. I have about 380 Solr Cores on a single Solr server (net indices size is about 220GB). Running the STATUS action

Re: what is difference between 4.1 and 5.x

2013-01-09 Thread Shawn Heisey
On 1/9/2013 5:11 PM, solr-user wrote: just curious as to what the difference is between 4.1 and 5.0 i.e. is 4.1 a maintenance branch for what is currently 4.0 or are they very different designs/architectures There are several code branches in the SVN repository. I'll talk about three of

Re: CoreAdmin STATUS performance

2013-01-09 Thread Shawn Heisey
On 1/9/2013 8:38 AM, Shahar Davidson wrote: I have a client app that uses SolrJ and needs to collect the names (and just the names) of all loaded cores. I have about 380 Solr Cores on a single Solr server (net indices size is about 220GB). Running the STATUS action takes about 800ms

Re: SOLR/Velocity Test Cases

2013-01-09 Thread Erik Hatcher
Marcos - I just happen to be tinkering with VrW over the last few days (to get some big improvements across the board with it and the /browse UI into Solr 5.0, and maybe eventually 4.x too), so I whipped up such a test case just now. Here's the short and sweet version: public void

Schema Field Names i18n

2013-01-09 Thread Daryl Robbins
Anyone have experience with internationalizing the field names in the SOLR schema, so users in different languages can specify fields in their own language? My first thoughts would be to create a custom search component or query parser that would convert localized field names back to the

Re: How to run many MoreLikeThis request efficiently?

2013-01-09 Thread Otis Gospodnetic
Patience, young Yandong :) Multi-threading *in your application* is the way to go. Alternatively, one could write a custom SearchComponent that is called once and inside of which the whole work is done after just one call to it. This component could then write the output somewhere, like in a new
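A minimal Python sketch of the "multi-threading in your application" option follows. The fetch_similar callable is a stand-in for whatever code issues one MoreLikeThis request and returns its results; it, the helper name run_concurrently, and the worker count are all assumptions for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def run_concurrently(doc_ids, fetch_similar, max_workers=8):
    """Call fetch_similar(doc_id) for every id using a thread pool.

    Returns a dict mapping each id to its result. Threads suit this
    workload because each call mostly waits on the network.
    """
    doc_ids = list(doc_ids)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in the same order as doc_ids.
        results = pool.map(fetch_similar, doc_ids)
        return dict(zip(doc_ids, results))


def fake_fetch(doc_id):
    """Hypothetical stand-in for one MoreLikeThis request."""
    return [f"{doc_id}-similar"]


print(run_concurrently(["doc1", "doc2", "doc3"], fake_fetch))
```

The worker count would need tuning against what the Solr server can absorb; too many parallel requests can simply move the bottleneck to the server.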

Re: SOLR/Velocity Test Cases

2013-01-09 Thread Erik Hatcher
And to add a little to this, since it looked ugly below, the $response.response.response.numFound thing is something I'm going to improve to make it leaner and cleaner to get at the actual result set and other response structures. $response is the actual SolrQueryResponse, and navigating that

Re: DIH fails after processing roughly 10million records

2013-01-09 Thread Lance Norskog
At this scale, your indexing job is prone to break in various ways. If you want this to be reliable, it should be able to restart in the middle of an upload, rather than starting over. On 01/08/2013 10:19 PM, vijeshnair wrote: Yes Shawn, the batchSize is -1 only and I also have the

Re: SolrCloud - Query performance degrades with multiple servers

2013-01-09 Thread sausarkar
Hi Yonik, Could you merge this feature into the 4.0 branch? We tried to use 4.1; it did solve the CPU spike, but we got other issues. As we are on a very tight schedule, it would be very beneficial if you could merge this feature into the 4.0 branch. Let me know. Thanks -- View this message in

Re: SolrCloud graph status is out of date

2013-01-09 Thread Mark Miller
It may be able to do that because it's forwarding requests to other nodes that are up? Would be good to dig into the logs to see if you can narrow in on the reason for the recovery_failed. - Mark On Jan 9, 2013, at 8:52 PM, Zeng Lames lezhi.z...@gmail.com wrote: Hi , we meet below

Re: SolrCloud - Query performance degrades with multiple servers

2013-01-09 Thread Shawn Heisey
On 1/9/2013 7:01 PM, sausarkar wrote: Hi Yonik, Could you merge this feature into the 4.0 branch? We tried to use 4.1; it did solve the CPU spike, but we got other issues. As we are on a very tight schedule, it would be very beneficial if you could merge this feature into the 4.0 branch. 4.1 *is* the

Setting up new SolrCloud - need some guidance

2013-01-09 Thread Shawn Heisey
I have a lot of experience with Solr, starting with 1.4.0 and currently running 3.5.0 in production. I am working on a 4.1 upgrade, but I have not touched SolrCloud at all. I now need to set up a brand new Solr deployment to replace a custom Lucene system, and due to the way the client

Re: Setting up new SolrCloud - need some guidance

2013-01-09 Thread Mark Miller
I'd put everything into one. You can upload different named sets of config files and point collections either to the same sets or different sets. You can really think about it the same way you would setting up a single node with multiple cores. The main difference is that it's easier to share

Re: How to run many MoreLikeThis request efficiently?

2013-01-09 Thread Yandong Yao
Hi Otis, Really appreciate your help on this!! Will go with multi-threading first, and then write a custom component if performance is not good enough. Regards, Yandong 2013/1/10 Otis Gospodnetic otis.gospodne...@gmail.com Patience, young Yandong :) Multi-threading *in your application*

Re: SolrCloud graph status is out of date

2013-01-09 Thread Zeng Lames
thanks Mark. will dig further into the logs. there is another related problem. we have collections with 3 shards (2 nodes in one shard), and the collection has about 1000 records in it. but unfortunately, after the leader goes down, the replica node fails to become the leader. the detail is: after