Re: Solr memory requirements?

2009-05-14 Thread vivek sar
Otis, We are not running a master-slave configuration. We get very few searches (admin only) in a day, so we didn't see the need for replication/snapshots. This problem is with one Solr instance managing 4 cores (each core has 200 million records). Both indexing and searching are performed by the same Solr

Max no of solr cores supported and how to restrict a query to a particular core?

2009-05-14 Thread KK
I want to know the maximum no of cores supported by Solr. 1000s or may be millions all under one solr instance ? Also I want to know how to redirect a particular query to a particular core. Actually I'm querying solr from Ajax, so I think there must be some request parameter that says which core

Re: Max no of solr cores supported and how to restrict a query to a particular core?

2009-05-14 Thread Shishir Jain
http://wiki.apache.org/solr/CoreAdmin Best regards, Shishir On Thu, May 14, 2009 at 1:58 PM, KK dioxide.softw...@gmail.com wrote: I want to know the maximum no of cores supported by Solr. 1000s or may be millions all under one solr instance ? Also I want to know how to redirect a particular

Re: Max no of solr cores supported and how to restrict a query to a particular core?

2009-05-14 Thread Noble Paul നോബിള്‍ नोब्ळ्
There is no hard limit on the number of cores. It is limited by your system's ability to open files and by its other resources. Queries are automatically sent to the appropriate core if your URL is http://host:port/corename/select On Thu, May 14, 2009 at 1:58 PM, KK dioxide.softw...@gmail.com wrote: I want
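The URL pattern in the reply above can be sketched in a few lines of Python. This is only an illustration: the function name `core_select_url`, the `context_path` argument, and the sample host, port, and core name are assumptions, not part of the thread.

```python
from urllib.parse import urlencode


def core_select_url(host, port, core, query, context_path=""):
    """Build a select URL that targets one named core.

    context_path is a hypothetical extra argument for deployments where
    Solr is mounted under a prefix; the thread itself only shows
    http://host:port/corename/select.
    """
    base = f"http://{host}:{port}{context_path}/{core}/select"
    # urlencode percent-escapes the query so it is safe inside a URL.
    return f"{base}?{urlencode({'q': query})}"


if __name__ == "__main__":
    print(core_select_url("localhost", 8983, "core0", "title:test"))
```

The core is chosen by the path segment, not by a request parameter, so an Ajax client only has to swap the core name in the URL.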

Re: Delete documents from index with dataimport

2009-05-14 Thread Andrew McCombe
Hi Yes I'd like the document deleted from Solr and yes, there is a unique document id field in Solr. Regards Andrew Andrew 2009/5/13 Fergus McMenemie fer...@twig.me.uk: Hi Is it possible, through dataimport handler to remove an existing document from the Solr index? I import/update from my

UK Solr users meeting?

2009-05-14 Thread Colin Hammond
I was wondering if there is an interest in a UK (South East) solr user group meeting Please let me know if you are interested. I am happy to organize. Regards, Colin

Re: UK Solr users meeting?

2009-05-14 Thread Fergus McMenemie
I was wondering if there is an interest in a UK (South East) solr user group meeting Please let me know if you are interested. I am happy to organize. Regards, Colin Yes, very interested. I am in Lincolnshire. -- === Fergus

Re: Delete documents from index with dataimport

2009-05-14 Thread Fergus McMenemie
Hi Yes I'd like the document deleted from Solr and yes, there is a unique document id field in Solr. In that case try the following. Create a field in the entity: <field column="$deleteDocById" regex="^false$" replaceWith="${jc.id}" sourceColName="active"/>

Re: Max no of solr cores supported and how to restrict a query to a particular core?

2009-05-14 Thread KK
Thank you very much. Got the point. One off the track question, can we automate the creation of new cores[it requires manually editing the solr.xml file as I know, and what about the location of core index directory, do we need to point that manually as well]. After going through the wiki what I

RE: Autocommit blocking adds? AutoCommit Speedup?

2009-05-14 Thread Gargate, Siddharth
Hi all, I am also facing the same issue where autocommit blocks all other requests. I have around 100,000 documents with an average size of 100K each. It took more than 20 hours to index. I have currently set autocommit maxtime to 7 seconds, mergeFactor to 25. Do I need more configuration

Re: master/slave failure scenario

2009-05-14 Thread nk 11
Ok so the VIP will point to the new master. But what makes a slave promoted to a master? Only the fact that it will receive add/update requests? And I suppose that this hot promotion is possible only if the slave is configured as master also... 2009/5/14 Noble Paul നോബിള്‍ नोब्ळ्

Re: Solr vs Sphinx

2009-05-14 Thread Michael McCandless
On Wed, May 13, 2009 at 12:33 PM, Grant Ingersoll gsing...@apache.org wrote: I've contacted others in the past who have done comparisons and after one round of emailing it was almost always clear that they didn't know what best practices are for any given product and thus were doing things

Re: Solr vs Sphinx

2009-05-14 Thread Andrey Klochkov
My most recent example of this is BooleanQuery's performance. It turns out, if you setAllowDocsOutOfOrder(true), it yields a sizable performance gain (27% on my most recent test) for OR queries. Mike, Can you please point me to some information concerning allowDocsOutOfOrder? What's this

Query syntax

2009-05-14 Thread Radha C.
Hello List, I need to search the multiple values from the same field. I am having the following syntax I am thinking of the first option. Can anyone tell me which one is correct syntax? Q=+title:=test +site_id:=22 3000676 566644 Q=+title:=test +site_id:=22 3000676 566644

Re: Autocommit blocking adds? AutoCommit Speedup?

2009-05-14 Thread Jack Godwin
20+ hours? I index 3 million records in 3 hours. Is your auto commit causing a snapshot? What do you have listed in the events. Jack On 5/14/09, Gargate, Siddharth sgarg...@ptc.com wrote: Hi all, I am also facing the same issue where autocommit blocks all other requests. I having

Date field

2009-05-14 Thread Jack Godwin
Does anyone know if there is still a bug in date fields? I'm having a problem boosting documents by date in Solr 1.3. Thanks, Jack -- Sent from my mobile device

Re: Query syntax

2009-05-14 Thread Shalin Shekhar Mangar
On Thu, May 14, 2009 at 5:20 PM, Radha C. cra...@ceiindia.com wrote: I need to search the multiple values from the same field. I am having the following syntax I am thinking of the first option. Can anyone tell me which one is correct syntax? Q=+title:=test +site_id:=22 3000676 566644

RE: Query syntax

2009-05-14 Thread Radha C.
Thanks for your reply. Yes, by mistake I added := in place of : . The title should match and the site_id should match any of these 23243455 , 245, 3457676 . _ From: Shalin Shekhar Mangar [mailto:shalinman...@gmail.com] Sent: Thursday, May 14, 2009 5:43 PM To:

Re: Query syntax

2009-05-14 Thread Shalin Shekhar Mangar
In that case, the following will work: q=+title:test +site_id:(23243455 245 3457676) On Thu, May 14, 2009 at 5:35 PM, Radha C. cra...@ceiindia.com wrote: Thanks for your reply. Yes by mistaken I added := in place of : . The title should match and the site_id should match any of these
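The query in the reply above requires the title term and accepts any one of the listed site_id values. A minimal Python sketch of building and URL-encoding it follows; the helper names `any_of` and `build_query` are made up for illustration.

```python
from urllib.parse import quote_plus


def any_of(field, values):
    """Group values in parentheses so the field matches any one of them."""
    return f"{field}:({' '.join(str(v) for v in values)})"


def build_query(title, site_ids):
    """Require the title term and require one of the site_id values."""
    return f"+title:{title} +{any_of('site_id', site_ids)}"


if __name__ == "__main__":
    q = build_query("test", [23243455, 245, 3457676])
    print(q)
    # A literal '+' must be percent-encoded in a URL, otherwise the
    # server decodes it as a space and the "required" marker is lost.
    print("q=" + quote_plus(q))
```

Encoding matters here because an unescaped `+` in a query string is read as a space, which silently drops the required-clause operator.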

Re: master/slave failure scenario

2009-05-14 Thread nk 11
Oh, so the configuration must be manually changed? Can't something be passed at (re)start time? 2009/5/14 Noble Paul നോബിള്‍ नोब्ळ् noble.p...@corp.aol.com On Thu, May 14, 2009 at 4:07 PM, nk 11 nick.cass...@gmail.com wrote: Ok so the VIP will point to the new master. but what makes a

Re: Solr vs Sphinx

2009-05-14 Thread Grant Ingersoll
Totally agree on optimizing out of the box experience, it's just never a one size fits all thing. And we have to be very careful about micro- benchmarks driving these settings. Currently, many of us use Wikipedia, but that's just one doc set and I'd venture to say most Solr users do not

Re: master/slave failure scenario

2009-05-14 Thread Noble Paul നോബിള്‍ नोब्ळ्
yeah there is a hack https://issues.apache.org/jira/browse/SOLR-1154?focusedCommentId=12708316&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12708316 On Thu, May 14, 2009 at 6:07 PM, nk 11 nick.cass...@gmail.com wrote: sorry for the mail. I wanted to hit reply :(

Re: Custom Servlet Filter, Where to put filter-mappings

2009-05-14 Thread Erik Hatcher
I like Grant's suggestion as the simplest solution. As for XML merging and XSLT, I really wouldn't want to go that route personally, but one solution that comes close to that is to template web.xml with some substitution tags and use Ant's ability to replace tokens. So we could put in

Re: Max no of solr cores supported and how to restrict a query to a particular core?

2009-05-14 Thread Noble Paul നോബിള്‍ नोब्ळ्
Solr already supports this . please refer this http://wiki.apache.org/solr/CoreAdmin#head-7ca1b98a9df8b8ca0dcfbfc49940ed5ac98c4a08 ensure that your solr.xml is persistent http://wiki.apache.org/solr/CoreAdmin#head-7508c24c6e2dadad2dfea39b2fba045062481da8 On Thu, May 14, 2009 at 3:43 PM, KK

Re: Solr vs Sphinx

2009-05-14 Thread Marvin Humphrey
On Thu, May 14, 2009 at 06:47:01AM -0400, Michael McCandless wrote: While I agree, one should properly match tune all apps they are testing (for a fair comparison), we in turn must set out-of-the-box defaults (in Lucene and Solr) that get you as close to the best practices as possible. So,

Re: master/slave failure scenario

2009-05-14 Thread nk 11
wow! that was just a couple of days old! thanks a lot! 2009/5/14 Noble Paul നോബിള്‍ नोब्ळ् noble.p...@corp.aol.com yeah there is a hack

RE: Autocommit blocking adds? AutoCommit Speedup?

2009-05-14 Thread jayson.minard
Siddharth, The settings you have in your solrconfig for ramBufferSizeMB and maxBufferedDocs control how much memory may be used during indexing besides any overhead with the documents being in-flight at a given moment (deserialized into memory but not yet handed to lucene). There are streaming

Re: Autocommit blocking adds? AutoCommit Speedup?

2009-05-14 Thread jayson.minard
Indexing speed comes down to a lot of factors. The settings as talked about above, VM settings, the size of the documents, how many are sent at a time, how active you can keep the indexer (i.e. one thread sending documents lets the indexer relax whereas N threads keeps pressure on the indexer),

Re: Solr vs Sphinx

2009-05-14 Thread Michael McCandless
On Thu, May 14, 2009 at 6:51 AM, Andrey Klochkov akloch...@griddynamics.com wrote: Can you please point me to some information concerning allowDocsOutOfOrder? What's this at all? There is this cryptic static setter (in Lucene): BooleanQuery.setAllowDocsOutOfOrder(boolean) It defaults to

Additional metadata when using Solr Cell

2009-05-14 Thread rossputin
Hi. I am indexing a PDF document with the ExtractingRequestHandler. My curl post has a URL like: ../solr/update/extract?ext.idx.attr=true&ext.def.fl=text&ext.literal.id=123&ext.literal.author=Somebody Sure enough I see in the server logs:
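The extract URL in the message above carries four separate request parameters. As a rough sketch, the query string can be assembled in Python from the parameter names and values quoted in the message; the function name `extract_query_string` is an assumption made for this example.

```python
from urllib.parse import urlencode


def extract_query_string(doc_id, author):
    """Join the ext.* parameters quoted in the message with '&'."""
    params = {
        "ext.idx.attr": "true",
        "ext.def.fl": "text",
        "ext.literal.id": doc_id,
        "ext.literal.author": author,
    }
    # urlencode keeps insertion order and separates pairs with '&'.
    return urlencode(params)


if __name__ == "__main__":
    print("../solr/update/extract?" + extract_query_string("123", "Somebody"))
```

When posting with curl, the whole URL usually needs quoting on the command line, since an unquoted `&` is interpreted by the shell.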

Re: Max no of solr cores supported and how to restrict a query to a particular core?

2009-05-14 Thread KK
Thank you very much. LOL, it's in the same wiki I was told to go through. I've a question regarding the creation of Solr cores on the fly. The wiki says: Creates a new core and register it. If persistence is enabled (persist=true), the configuration for this new core will be saved in 'solr.xml'.

Re: Custom Servlet Filter, Where to put filter-mappings

2009-05-14 Thread Jacob Singh
I found a very elegant (I think) solution to this. I'll post a patch today or tomorrow. Best, -Jacob On Thu, May 14, 2009 at 6:22 PM, Erik Hatcher e...@ehatchersolutions.com wrote: I like Grant's suggestion as the simplest solution. As for XML merging and XSLT, I really wouldn't want to go

Re: Additional metadata when using Solr Cell

2009-05-14 Thread Grant Ingersoll
What does /admin/luke show for fields and terms in the fields? On May 14, 2009, at 10:03 AM, rossputin wrote: Hi. I am indexing a PDF document with the ExtractingRequestHandler. My curl post has a URL like: ../solr/update/extract?ext.idx.attr

Re: Additional metadata when using Solr Cell

2009-05-14 Thread rossputin
There is no reference to the author field I am trying to set.. I am using the latest nightly download. -- Ross Grant Ingersoll-6 wrote: what does /admin/luke show for fields and terms in the fields? On May 14, 2009, at 10:03 AM, rossputin wrote: Hi. I am indexing a PDF document

Re: Additional metadata when using Solr Cell

2009-05-14 Thread Grant Ingersoll
Do you have an author field in your schema? On May 14, 2009, at 10:31 AM, rossputin wrote: There is no reference to the author field I am trying to set.. I am using the latest nightly download. -- Ross Grant Ingersoll-6 wrote: what does /admin/luke show for fields and terms in the

AW: AW: Geographical search based on latitude and longitude

2009-05-14 Thread Norman Leutner
Hi Grant, thanks for the reply. Is the logic for a function query that calculates distances that Yonik mentioned (gdist(position,101.2,234.3)) already implemented? This could be either very inaccurate or load intense. If the logic isn't done until now maybe I can prepare it. Norman

Re: Solr vs Sphinx

2009-05-14 Thread gdeconto
Yonik Seeley-2 wrote: It's probably the case that every search engine out there is faster than Solr at one thing or another, and that Solr is faster or better at some other things. I prefer to spend my time improving Solr rather than engage in benchmarking wars... and Solr 1.4 will

CommonsHttpSolrServer vs EmbeddedSolrServer

2009-05-14 Thread sachin78
What is the difference between EmbeddedSolrServer and CommonsHttpSolrServer? Which is the preferred server to use? In some blog I read that EmbeddedSolrServer is 50% faster than CommonsHttpSolrServer, then why do we need to use CommonsHttpSolrServer? Can anyone please guide me the right

Re: Replication master+slave

2009-05-14 Thread Bryan Talbot
https://issues.apache.org/jira/browse/SOLR-1167 -Bryan On May 13, 2009, at May 13, 7:20 PM, Otis Gospodnetic wrote: Bryan, maybe it's time to stick this in JIRA? http://wiki.apache.org/solr/HowToContribute Thanks, Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

Powered by Solr

2009-05-14 Thread Terence Gannon
I was intending to make an entry to the 'Powered by Solr' page, so I created a Wiki account and logged in. When I go to that page, it shows it as being 'immutable', which I take as meaning I can't edit it. Is there someone I can send the information to who can do the edit? Or perhaps there is

Re: Powered by Solr

2009-05-14 Thread Yonik Seeley
On Thu, May 14, 2009 at 1:54 PM, Terence Gannon porfa...@gmail.com wrote: I was intending to make an entry to the 'Powered by Solr' page, so I created a Wiki account and logged in.  When I go to that page, it shows it as being 'immutable', which I take as meaning I can't edit it. Did you try

Re: Solr vs Sphinx

2009-05-14 Thread Michael McCandless
On Thu, May 14, 2009 at 9:07 AM, Marvin Humphrey mar...@rectangular.com wrote: Richard Feynman: ...if you're doing an experiment, you should report everything that you think might make it invalid - not only what you think is right about it: other causes that could possibly explain

Re: Solr memory requirements?

2009-05-14 Thread vivek sar
I don't know if field type has any impact on the memory usage - does it? Our use cases require complete matches, thus there is no need of any analysis in most cases - does it matter in terms of memory usage? Also, is there any default caching used by Solr if I comment out all the caches under

Re: Powered by Solr

2009-05-14 Thread Terence Gannon
Did you try hitting refresh on your browser after you logged in? Wow, I really should have known that...thank you for your patient reply, Yonik. Regards...Terence

replication of lucene-write.lock file

2009-05-14 Thread Bryan Talbot
When using solr 1.4 replication, I see that the lucene-write.lock file is being replicated to slaves. I'm importing data from a db every 5 minutes using cron to trigger a DIH delta-import. Replication polls every 60 seconds and the master is configured to take a snapshot

Re: CommonsHttpSolrServer vs EmbeddedSolrServer

2009-05-14 Thread Eric Pugh
CommonsHttpSolrServer is how you access Solr from a Java client via HTTP. You can connect to a Solr running anywhere. EmbeddedSolrServer starts up Solr internally, and connects directly, all in a single JVM... Embedded may be faster, the jury is out, but you have to have your Solr

Re: CommonsHttpSolrServer vs EmbeddedSolrServer

2009-05-14 Thread Ryan McKinley
Right -- which one you pick will depend more on your runtime environment than anything else. If you need to hit a server (on a different machine) CommonsHttpSolrServer is your only option. If you are running an embedded application -- where your custom code lives in the same JVM as solr

Re: Solr vs Sphinx

2009-05-14 Thread Mike Klaas
On 14-May-09, at 9:46 AM, gdeconto wrote: Solr is very fast even with 1.3 and the developers have done an incredible job. However, maybe the next Solr improvement should be the creation of a configuration manager and/or automated tuning tool. I know that optimizing Solr performance can

Re: Solr memory requirements?

2009-05-14 Thread vivek sar
Some update on this issue, 1) I attached jconsole to my app and monitored the memory usage. During indexing the memory usage goes up and down, which I think is normal. The memory remains around the min heap size (4 G) for indexing, but as soon as I run a search the tenured heap usage jumps up to

Re: Solr vs Sphinx

2009-05-14 Thread Mark Miller
Michael McCandless wrote: So why haven't we enabled this by default, already? Why isn't Lucene done already :) - Mark

Search Query Questions

2009-05-14 Thread Chris Miller
I have two questions: 1) How do I search for ALL items? For example, I provide a sort query parameter of updated and a rows query parameter of 10 to limit the query results. I still have to provide a search query, of course. What if I want to provide a list of ALL results that match this?

Re: Search Query Questions

2009-05-14 Thread Chris Miller
Oh, one more question 3) Is there a way to effectively do a GROUP BY? For example, if I have a document that has a photoID attached to it, is there a way to return a set of results that does not duplicate the photoID field? Thanks, Chris Miller ServerMotion www.servermotion.com On

Re: Additional metadata when using Solr Cell

2009-05-14 Thread Mark Miller
rossputin wrote: Hi. I am indexing a PDF document with the ExtractingRequestHandler. My curl post has a URL like: ../solr/update/extract?ext.idx.attr=true&ext.def.fl=text&ext.literal.id=123&ext.literal.author=Somebody Sure enough I see in the server logs:

Re: Solr memory requirements?

2009-05-14 Thread Mark Miller
800 million docs is on the high side for modern hardware. If even one field has norms on, you're talking almost 800 MB right there. And then if another Searcher is brought up while the old one is serving (which happens when you update)? Doubled. Your best bet is to distribute across a couple
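The figures in the reply above follow from simple arithmetic, assuming norms cost one byte per document for each field that has them enabled (which is what the "almost 800 MB" estimate implies). A back-of-the-envelope sketch, with a hypothetical helper name:

```python
def norms_bytes(num_docs, fields_with_norms, open_searchers=1):
    """Estimate heap held by norms.

    Assumption: one byte per document per field with norms, and each
    open searcher holds its own copy.
    """
    return num_docs * fields_with_norms * open_searchers


if __name__ == "__main__":
    docs = 800_000_000  # 4 cores x 200 million records, as in the thread
    single = norms_bytes(docs, fields_with_norms=1)
    during_warmup = norms_bytes(docs, fields_with_norms=1, open_searchers=2)
    print(f"one searcher:  {single / 1e6:.0f} MB")
    print(f"two searchers: {during_warmup / 1e6:.0f} MB")
```

The estimate scales linearly with the number of fields that keep norms, so it is a lower bound for an index with several such fields.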

Re: Search Query Questions

2009-05-14 Thread Matt Weber
I think you will want to look at the Field Collapsing patch for this. http://issues.apache.org/jira/browse/SOLR-236 . Thanks, Matt Weber eSr Technologies http://www.esr-technologies.com On May 14, 2009, at 5:52 PM, Chris Miller wrote: Oh, one more question 3) Is there a way to

Re: Solr memory requirements?

2009-05-14 Thread vivek sar
Thanks Mark. I checked all the items you mentioned, 1) I've set omitNorms=true for all my indexed fields (for stored-only fields I guess it doesn't matter) 2) I've tried commenting out all caches in the solrconfig.xml, but that doesn't help much 3) I've tried commenting out the first and new searcher