Re: error opening index solr 4.0 with lukeall-4.0.0-ALPHA.jar

2012-11-19 Thread Bernd Fehling
I just downloaded, compiled and opened an optimized Solr 4.0 index in read-only mode without problems. Could browse through the docs, search with different analyzers, ... Looks good. On 19.11.2012 08:49, Toke Eskildsen wrote: On Mon, 2012-11-19 at 08:10 +0100, Bernd Fehling wrote: I think there

RE: Reduce QueryComponent prepare time

2012-11-19 Thread Markus Jelsma
I'd also like to know which parts of the entire query constitute the prepare time and if it would matter significantly if we extend the edismax plugin and hardcode the parameters we pass into (reusable) objects. Thanks, Markus -Original message- From:Markus Jelsma

configuring data source in apache tomcat

2012-11-19 Thread Leena Jawale
Hi, I have configured Apache Solr with Tomcat; for that I deployed the .war file in Tomcat. I have created the Solr home directory at C:\solr. After starting Tomcat, the solr.war file gets extracted and a folder is created in webapps. In that folder, in WEB-INF/web.xml, I had written env-entry

Re: Reduce QueryComponent prepare time

2012-11-19 Thread Mikhail Khludnev
Markus, It's hard to suggest anything until you provide a profiler snapshot which shows what it spends time on in prepare. As far as I know, in prepare it parses queries, e.g. we have really heavy query parsers, but I don't think that's really common. On Mon, Nov 19, 2012 at 3:08 PM, Markus

CloudSolrServer or load-balancer for indexing

2012-11-19 Thread Marcin Rzewucki
Hi, As far as I know CloudSolrServer is recommended for indexing to SolrCloud. I wonder what the advantages of this approach are over an external load balancer. Let's say I have a 4-node SolrCloud (2 shards + replicas) + 1 server running ZooKeeper. I can use CloudSolrServer for indexing or

Re: SolrCloud Error after leader restarts

2012-11-19 Thread Mark Miller
You're using RAM dir? Sent from my iPhone On Nov 19, 2012, at 1:21 AM, deniz denizdurmu...@gmail.com wrote: Hello, for test purposes, I am running two zookeepers on ports 2181 and 2182. and i have two solr instances running on different machines... For the one which is running on my local

Re: CloudSolrServer or load-balancer for indexing

2012-11-19 Thread Mark Miller
Nodes stop accepting updates if they cannot talk to Zookeeper, so the external load balancer is no advantage there. CloudSolrServer will be smart about knowing who the leaders are, eventually will do hashing, will auto add/remove nodes from rotation based on the cluster state in Zookeeper, and

SolrCloud and external file fields

2012-11-19 Thread Simone Gianni
Hi all, I'm planning to move a quite big Solr index to SolrCloud. However, in this index, an external file field is used for popularity ranking. Does SolrCloud support external file fields? How does it cope with sharding and replication? Where should the external file be placed now that the

Re: Custom Solr indexer/searcher

2012-11-19 Thread Smiley, David W.
FWIW I helped someone a few days ago with a similar problem and similarly advised modifying SpatialPrefixTree: http://lucene.472066.n3.nabble.com/PointType-multivalued-query-tt4020445.html IMO GeoHashField should be deprecated because it adds no value. ~ David On Nov 16, 2012, at 1:49 PM,

solr cloud shards and servers issue

2012-11-19 Thread joe.cohe...@gmail.com
Hi I have the following scenario: I have 1 collection across 10 servers. Num of shards: 10. Each server has 2 solr instances running. replication is 2. I want to move one of the instances to another server. meaning, kill the solr process in server X and start a new solr process in server Y

How do I best detect when my DIH load is done?

2012-11-19 Thread Andy Lester
A little while back, I needed a way to tell if my DIH load was done, so I made up a little Ruby program to query /dih?command=status . The program is here: http://petdance.com/2012/07/a-little-ruby-program-to-monitor-solr-dih-imports/ Is this the best way to do it? Is there some other tool or
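The polling approach described in this thread can be sketched in Python as well. This is only a minimal sketch under stated assumptions: the handler is mounted at /dih as in the post, the handler accepts wt=json, and the parsed response carries a top-level "status" value that reads "busy" while an import runs and "idle" otherwise. The base URL and the function names are placeholders, not part of the original program.

```python
import json
import time
import urllib.request

# Hypothetical base URL; adjust host, port and core to your setup.
SOLR_BASE = "http://localhost:8983/solr"


def status_url(base=SOLR_BASE, handler="/dih"):
    """Build the DIH status URL (handler path taken from the post)."""
    return f"{base}{handler}?command=status&wt=json"


def import_finished(status_response):
    """Return True when the parsed status response reports 'idle'.

    Assumption: the response has a top-level "status" key whose value is
    "busy" during an import and "idle" once nothing is running.
    """
    return status_response.get("status") == "idle"


def wait_for_import(poll_seconds=5, fetch=None):
    """Poll until the import is reported finished; return the last response."""
    if fetch is None:
        def fetch():
            # Default fetcher: one HTTP GET against the status URL.
            with urllib.request.urlopen(status_url()) as resp:
                return json.load(resp)
    while True:
        response = fetch()
        if import_finished(response):
            return response
        time.sleep(poll_seconds)
```

Note that "idle" on its own probably does not say whether the import succeeded, so the returned response would still need to be inspected for failure messages.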

Re: solr cloud shards and servers issue

2012-11-19 Thread Mark Miller
On Nov 19, 2012, at 11:24 AM, joe.cohe...@gmail.com wrote: Hi I have the following scenario: I have 1 collection across 10 servers. Num of shards: 10. Each server has 2 solr instances running. replication is 2. I want to move one of the instances to another server. meaning, kill the solr

Order by hl.snippets count

2012-11-19 Thread Gabriel Croitoru
Hello, I'm using Solr 1.3 with http://wiki.apache.org/solr/HighlightingParameters options. The client just asked us to change the order from the default score to the number of hl.snippets per document. Is this possible from Solr configuration? (without implementing a custom scoring

Re: solr cloud shards and servers issue

2012-11-19 Thread joe.cohe...@gmail.com
How can I unload a solrCore after i killed the running process? Mark Miller-3 wrote On Nov 19, 2012, at 11:24 AM, joe.cohen.m@ wrote: Hi I have the following scenario: I have 1 collection across 10 servers. Num of shards: 10. Each server has 2 solr instances running. replication is

RE: inconsistent number of results returned in solr cloud

2012-11-19 Thread Buttler, David
Answers inline below -Original Message- From: Erick Erickson [mailto:erickerick...@gmail.com] Sent: Saturday, November 17, 2012 6:40 AM To: solr-user@lucene.apache.org Subject: Re: inconsistent number of results returned in solr cloud Hmmm, first an aside. If by commit after every batch

RE: Architecture Question

2012-11-19 Thread Buttler, David
If you just want to store the data, you can dump it into HDFS sequence files. While HBase is really nice if you want to process and serve data real-time, it adds overhead to use it as pure storage. Dave -Original Message- From: Cool Techi [mailto:cooltec...@outlook.com] Sent: Friday,

RE: How do I best detect when my DIH load is done?

2012-11-19 Thread Dyer, James
Andy, I use an approach similar to yours. There may be something better, however. You might be able to write an onImportEnd listener to tell you when it ends. See http://wiki.apache.org/solr/DataImportHandler#EventListeners for a little documentation See also

Search using the result returned from the spell checking component

2012-11-19 Thread Roni
Hi, I've successfully configured the spell check component and it works well. I couldn't find an answer to my question so any help would be much appreciated: Can I send a single request to Solr, and make it so that if any part of the query was misspelled, then the search would be performed

RE: Search using the result returned from the spell checking component

2012-11-19 Thread Dyer, James
What you want isn't supported. You always will need to issue that second request. This would be a nice feature to add though. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: Roni [mailto:r...@socialarray.com] Sent: Monday, November 19, 2012

RE: Search using the result returned from the spell checking component

2012-11-19 Thread Roni
Thank you. I was wondering - what if I make a first request, and ask it to return only 1 result - will it still return the spell suggestions while avoiding the overhead of returning all relevant results? Then I could make a second request to get all the results I need. Would that work? --

Cacti monitoring of Solr and Tomcat

2012-11-19 Thread Andy Lester
Is anyone using Cacti to track trends over time in Solr and Tomcat metrics? We have Nagios set up for alerts, but want to track trends over time. I've found a couple of examples online, but none have worked completely for me. I'm looking at this one next:

Re: Search using the result returned from the spell checking component

2012-11-19 Thread Walter Underwood
You can even request zero rows. That will still return the number of matches. --wunder On Nov 19, 2012, at 11:12 AM, Roni wrote: Thank you. I was wondering - what if I make a first request, and ask it to return only 1 result - will it still return the spell suggestions while avoiding the
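A zero-row probe request of the kind suggested here could be built roughly like this in Python. It is a sketch only: it assumes the /select handler has the spell check component attached and that JSON output is requested with wt=json; the base URL and function names are illustrative.

```python
from urllib.parse import urlencode


def spellcheck_probe_url(base_url, query):
    """Build a request that asks for zero rows but keeps spell checking on."""
    params = {
        "q": query,
        "rows": 0,             # no documents returned, match count still reported
        "spellcheck": "true",  # assumes the handler has the spellcheck component
        "wt": "json",
    }
    return f"{base_url}/select?{urlencode(params)}"


def match_count(search_response):
    """Read the number of matches from a parsed JSON search response."""
    return search_response["response"]["numFound"]
```

The suggestions from this first response can then drive the second, full request.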

Re: Search using the result returned from the spell checking component

2012-11-19 Thread Roni
And performance-wise: is asking for 0 rows the same as asking for 100 rows? On Mon, Nov 19, 2012 at 9:22 PM, Walter Underwood [via Lucene] ml-node+s472066n4021143...@n3.nabble.com wrote: You can even request zero rows. That will still return the number of matches. --wunder On Nov 19, 2012,

Re: How do I best detect when my DIH load is done?

2012-11-19 Thread Shawn Heisey
On 11/19/2012 11:52 AM, Dyer, James wrote: Andy, I use an approach similar to yours. There may be something better, however. You might be able to write an onImportEnd listener to tell you when it ends. See http://wiki.apache.org/solr/DataImportHandler#EventListeners for a little

Can Solr v1.4 and v4.0 co-exist in Tomcat?

2012-11-19 Thread kfdroid
I have an existing v1.4 implementation of Solr that supports 2 lines of business. For a third line of business the need to do Geo searching requires using Solr 4.0. I'd like to minimize the impact to the existing lines of business (let them upgrade at their own pace), however I want to share

Re: How do I best detect when my DIH load is done?

2012-11-19 Thread geeky2
Hello Andy, i had a similar question on this some time ago. http://lucene.472066.n3.nabble.com/possible-status-codes-from-solr-during-a-DIH-data-import-process-td3987110.html#a3987123

Re: Cacti monitoring of Solr and Tomcat

2012-11-19 Thread Otis Gospodnetic
Hi Andy, My favourite topic ;) See my sig below for SPM for Solr. At my last company we used Cacti but it felt very 1990s almost. Some ppl use zabbix, some graphite, some newrelic, some SPM, some nothing! Otis -- Solr Performance Monitoring - http://sematext.com/spm On Nov 19, 2012 2:18 PM,

RE: How do I best detect when my DIH load is done?

2012-11-19 Thread geeky2
James, was it you (cannot remember) that replied to one of my queries on this subject and mentioned that there was consideration being given to cleaning up the response codes to remove ambiguity? -- View this message in context:

Inserting many documents and update relations

2012-11-19 Thread uwe72
Hi there, I have a question of principle. We have around 5 million Lucene documents. At the beginning we have around 4000 XML files which we transform to SolrInputDocuments by using SolrJ and adding them to the index. A document is also related to other documents, so while adding a document we

RE: How do I best detect when my DIH load is done?

2012-11-19 Thread Dyer, James
I'm not sure. But there are at least a few jira issues open with differing ideas on how to improve this. For instance, SOLR-1554 SOLR-2728 SOLR-2729 James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: geeky2 [mailto:gee...@hotmail.com] Sent:

Re: Can Solr v1.4 and v4.0 co-exist in Tomcat?

2012-11-19 Thread James Jory
Hi Ken- We've been running 1.3 and 4.0 as separate web apps within the same Tomcat instance for the last 3 weeks with no issues. The only challenge for us was refactoring our app client code to use SolrJ 4.0 to access both the 1.3 and 4.0 backends. The calls to the 1.3 backend use the XML

Re: Per user document exclusions

2012-11-19 Thread SUJIT PAL
Hi Christian, Since customization is not a problem in your case, how about writing out the userId and excluded document ids to the database when it is excluded, and then for each query from the user (possibly identified by a userid parameter), lookup the database by userid, construct a NOT

Re: Solr4.0 / SolrCloud queries

2012-11-19 Thread shreejay
Hi all , I have managed to successfully index around 6 million documents, but while indexing (and even now after the indexing has stopped), I am running into a bunch of errors. The most common error I see is / null:org.apache.solr.common.SolrException:

Re: Per user document exclusions

2012-11-19 Thread Otis Gospodnetic
Hi Christian, Since you didn't explicitly mention it, I'm not sure if you are aware of it - ManifoldCF has ACL support built in. This may be what you are after. Otis -- Solr Performance Monitoring - http://sematext.com/spm/index.html Search Analytics -

Re: Best way to retrieve 20 specific documents

2012-11-19 Thread Tomás Fernández Löbbe
If you are in Solr 4 you could use realtime get and list the ids that you need. For example: http://host:port/solr/mycore/get?ids=my_id_1,my_id_2... See http://lucidworks.lucidimagination.com/display/solr/RealTime+Get Tomás On Mon, Nov 19, 2012 at 5:27 PM, Otis Gospodnetic
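The realtime get URL from this reply can be assembled from a list of ids along these lines. A minimal Python sketch, assuming the /get handler is enabled on the core; the function name is made up for illustration, and the host and core in the usage note come from the example URL in the message.

```python
from urllib.parse import quote


def realtime_get_url(core_url, ids):
    """Build a realtime get URL listing the wanted ids, comma separated."""
    # Percent-encode each id so that characters such as spaces or '&'
    # cannot break the query string.
    joined = ",".join(quote(str(doc_id), safe="") for doc_id in ids)
    return f"{core_url}/get?ids={joined}"
```

For example, `realtime_get_url("http://host:port/solr/mycore", ["my_id_1", "my_id_2"])` yields the URL shape shown in the message.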

Re: Cacti monitoring of Solr and Tomcat

2012-11-19 Thread Andy Lester
On Nov 19, 2012, at 1:46 PM, Otis Gospodnetic otis.gospodne...@gmail.com wrote: My favourite topic ;) See my sig below for SPM for Solr. At my last company we used Cacti but it felt very 1990s almost. Some ppl use zabbix, some graphite, some newrelic, some SPM, some nothing! SPM looks

Re: Solr Delta Import Handler not working

2012-11-19 Thread Lance Norskog
| dataSource=null I think this should not be here. The datasource should default to the dataSource listing. And 'rootEntity=true' should be in the XPathEntityProcessor block, because you are adding each file as one document. - Original Message - | From: Spadez

Re: Cacti monitoring of Solr and Tomcat

2012-11-19 Thread Walter Underwood
We (Chegg) are using New Relic, even for the dev systems. It is pretty good, but only reports averages, when we need median and 90th percentile. Our next step is putting something together with the Metrics server from Coda Hale (http://metrics.codahale.com/) and Graphite

Odd behaviour for case insensitive searches

2012-11-19 Thread shemszot
Hello Everyone, I've been having issues with odd SOLR behavior when searching for case insensitive data. Let's take a vanilla SOLR config (from the example). Then I uploaded the default solr.xml document with a slight modification to the field with name 'name'. I added Thomas NOSQL. add doc

Re: solr cloud shards and servers issue

2012-11-19 Thread Otis Gospodnetic
Joe, Can you remove it from the config and have it gone when you restart Solr? Or restart Solr and unload as described on http://wiki.apache.org/solr/CoreAdmin ? Otis -- Performance Monitoring - http://sematext.com/spm/index.html Search Analytics - http://sematext.com/search-analytics/index.html

Re: CloudSolrServer or load-balancer for indexing

2012-11-19 Thread Marcin Rzewucki
OK, got it. Thanks. On 19 November 2012 15:00, Mark Miller markrmil...@gmail.com wrote: Nodes stop accepting updates if they cannot talk to Zookeeper, so the external load balancer is no advantage there. CloudSolrServer will be smart about knowing who the leaders are, eventually will do

Re: CloudSolrServer or load-balancer for indexing

2012-11-19 Thread Upayavira
A single ZooKeeper node could be a single point of failure. It is recommended that you have at least three ZooKeeper nodes running as an ensemble. ZooKeeper has a simple rule - over half of your nodes must be available to achieve quorum and thus be functioning. This is to avoid 'split-brain'.
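The "over half" rule works out as follows. This is plain arithmetic written as a small Python sketch, not ZooKeeper code, and the function names are illustrative.

```python
def quorum_size(ensemble_size):
    """Smallest number of nodes that is strictly more than half the ensemble."""
    if ensemble_size < 1:
        raise ValueError("an ensemble needs at least one node")
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """How many nodes can be lost while a quorum is still available."""
    return ensemble_size - quorum_size(ensemble_size)
```

With one node no failure is tolerated, with three nodes one failure is, and a fourth node raises the quorum to three without tolerating any more failures than three nodes do, which is why odd ensemble sizes are the usual choice.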

Re: Cacti monitoring of Solr and Tomcat

2012-11-19 Thread Chris Hostetter
: Is anyone using Cacti to track trends over time in Solr and Tomcat : metrics? We have Nagios set up for alerts, but want to track trends : over time. A key thing to remember is that all of the stats you can get from solr via HTTP are also available via JMX...

Re: solr cloud shards and servers issue

2012-11-19 Thread Tomás Fernández Löbbe
Maybe it would be better if Solr checked the live nodes and not all the existing nodes in zk. If a server dies and you need to start a new one, it would go straight to the correct shard without one needing to specify it manually. Of course, the problem could be if a server goes down for a minute

Re: Order by hl.snippets count

2012-11-19 Thread Koji Sekiguchi
(12/11/20 1:50), Gabriel Croitoru wrote: Hello, I'm using Solr 1.3 with http://wiki.apache.org/solr/HighlightingParameters options. The client just asked us to change the order from the default score to the number of hl.snippets per document. Is this possible from Solr configuration?

Re: Best way to retrieve 20 specific documents

2012-11-19 Thread Shawn Heisey
On 11/19/2012 1:49 PM, Dotan Cohen wrote: On Mon, Nov 19, 2012 at 10:27 PM, Otis Gospodnetic otis.gospodne...@gmail.com wrote: Hi, How about id1 OR id2 OR id3? :) Thanks, Otis. This was my first inclination (id:123 OR 456), but it didn't work when I tried. At your instigation I tried then

Re: Execute an independent query from the main query

2012-11-19 Thread Indika Tantrigoda
Hi Otis, Yes, that seems like one solution, however I have multiple opening and closing hours, within the same day. Therefore it might become somewhat complicated to manage the index. For now I shifted the business logic to the client and a second query is made to get the additional data. Thanks

Re: Preventing accepting queries while custom QueryComponent starts up?

2012-11-19 Thread Chris Hostetter
: I have several custom QueryComponents that have high one-time startup costs : (hashing things in the index, caching things from a RDBMS, etc...) you need to provide more details about how your custom components work -- in particular: where in the lifecycle of your components is this

Re: Best way to retrieve 20 specific documents

2012-11-19 Thread Upayavira
In fact, you shouldn't need OR: id:(123 456 789) will default to OR. Upayavira On Mon, Nov 19, 2012, at 10:45 PM, Shawn Heisey wrote: On 11/19/2012 1:49 PM, Dotan Cohen wrote: On Mon, Nov 19, 2012 at 10:27 PM, Otis Gospodnetic otis.gospodne...@gmail.com wrote: Hi, How about id1 OR
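The grouped form above is easy to generate from a list of ids. A minimal Python sketch, assuming the standard Lucene query parser with OR as the default operator; the helper names are made up for illustration, and the character class is a conservative set of query syntax characters to escape.

```python
import re

# Characters with special meaning in the Lucene query syntax, plus whitespace.
_SPECIAL = re.compile(r'([+\-!(){}\[\]^"~*?:\\/&|\s])')


def escape_term(term):
    """Backslash-escape query syntax characters inside a single term."""
    return _SPECIAL.sub(r"\\\1", term)


def ids_query(field, ids):
    """Build field:(id1 id2 ...), relying on OR being the default operator."""
    terms = " ".join(escape_term(str(doc_id)) for doc_id in ids)
    return f"{field}:({terms})"
```

`ids_query("id", [123, 456, 789])` gives `id:(123 456 789)`. If the default operator is AND, or a different parser is in use, the explicit `OR` form is the safer choice.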

All-wildcard query performance

2012-11-19 Thread Aleksey Vorona
Hi, Our application sometimes generates queries with one of the constraints: field:[* TO *] I expected this query performance to be the same as if we omitted the field constraint completely. However, I see the performance of the two queries to differ drastically (3ms without all-wildcard

Re: SolrCloud Error after leader restarts

2012-11-19 Thread Mark Miller
It's generally not a good choice to use RAM directory. 4x SolrCloud does not work with it; 5x does, but in any case, RAM dir is not persistent, so when you restart Solr you will lose the data. MMap is generally the right dir to use. - Mark On Nov 19, 2012, at 6:52 PM, deniz

solr4 MULTIPOLYGON search syntax

2012-11-19 Thread jend
Does anybody have any info on how to properly construct a multipolygon search? I'm very interested in Polygon (search all documents within a shape) Multipolygon (search all documents within 2+ shapes) Multipolygon (search all documents with 2+ shapes but not within an area within a shape - if you

Re: All-wildcard query performance

2012-11-19 Thread Shawn Heisey
Hi, Our application sometimes generates queries with one of the constraints: field:[* TO *] I expected this query performance to be the same as if we omitted the field constraint completely. However, I see the performance of the two queries to differ drastically (3ms without

Re: More Like this without a document?

2012-11-19 Thread Chris Hostetter
: If I want to use MoreLikeThis algorithm I need to add this documents in the : index? The MoreLikeThis will work with soft commits? Is there a solution to : do a MoreLikeThis without adding the document in the index? you can feed the MoreLikeThisHandler a ContentStream (ie: POST data, or file

Re: SolrCloud Error after leader restarts

2012-11-19 Thread deniz
i know facts about ramdirectory actually.. just running some perf tests on our dev env right now.. so in case i use ramdir with 5x cloud, it will still not do the recovery? i mean it will not get the data from the leader and fill its ramdir again? - Zeki ama calismiyor... Calissa yapar...

Re: is it possible to save the search query?

2012-11-19 Thread Romita Saha
Hi, Thanks for your guidance. I am unable to figure out what is a doc ID and how can i collect all the doc IDs. Thanks and regards, Romita Saha From: Otis Gospodnetic otis.gospodne...@gmail.com To: solr-user@lucene.apache.org, Date: 11/09/2012 12:33 AM Subject:Re: is it

Re: SolrCloud Error after leader restarts

2012-11-19 Thread Mark Miller
On Nov 19, 2012, at 9:11 PM, deniz denizdurmu...@gmail.com wrote: so in case i use ramdir with 5x cloud, it will still not do the recovery? i mean it will not get the data from the leader and fill its ramdir again? Yes, in 5x RAM directory should be able to recover. - Mark

Ranking by sorting score and rankingField better or by product(score, rankingField)?

2012-11-19 Thread Floyd Wu
Hi there, I have a field (an externalFileField, called rankingField) whose value (type=float) is calculated by the client app. In Solr's original scoring model, changing a boost value results in a different ranking, so I think product(score,rankingField) may be equivalent to the Solr scoring model.

Re: SolrCloud Error after leader restarts

2012-11-19 Thread deniz
Mark Miller-3 wrote On Nov 19, 2012, at 9:11 PM, deniz <denizdurmus87@> wrote: so in case i use ramdir with 5x cloud, it will still not do the recovery? i mean it will not get the data from the leader and fill its ramdir again? Yes, in 5x RAM directory should be able to recover.

Re: Ranking by sorting score and rankingField better or by product(score, rankingField)?

2012-11-19 Thread Otis Gospodnetic
Hi, 3. yes, you can sort by function - http://search-lucene.com/?q=solr+sort+by+function 2. this will sort by score only when there is a tie in ranking (two docs have the same rank value) 1. the reverse of 2. Otis -- Performance Monitoring - http://sematext.com/spm/index.html Search Analytics -

Re: Custom ranking solutions?

2012-11-19 Thread Otis Gospodnetic
Hi Floyd, Use debugQuery=true and let's see it. :) Otis -- Performance Monitoring - http://sematext.com/spm/index.html Search Analytics - http://sematext.com/search-analytics/index.html On Mon, Nov 19, 2012 at 9:29 PM, Floyd Wu floyd...@gmail.com wrote: Hi there, Before ExternalFileField

Re: is it possible to save the search query?

2012-11-19 Thread Otis Gospodnetic
Hi, Document ID would be a field in your document. A unique field that you specify when indexing. You can collect it by telling Solr to return it in the search results by including it in the fl= parameter. Otis -- Performance Monitoring - http://sematext.com/spm/index.html Search Analytics
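Collecting the IDs from a search response might look like this in Python. This is a sketch under assumptions: the unique field is called "id" (yours may differ), the response was requested with wt=json, and the function names and base URL are illustrative only.

```python
from urllib.parse import urlencode


def id_only_query_url(base_url, query, id_field="id", rows=10):
    """Build a search URL that returns only the unique id field via fl=."""
    params = {"q": query, "fl": id_field, "rows": rows, "wt": "json"}
    return f"{base_url}/select?{urlencode(params)}"


def collect_ids(search_response, id_field="id"):
    """Pull the id field out of every document in a parsed JSON response."""
    docs = search_response.get("response", {}).get("docs", [])
    return [doc[id_field] for doc in docs if id_field in doc]
```

Documents that lack the field are skipped rather than raising an error.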

Re: Best way to retrieve 20 specific documents

2012-11-19 Thread Otis Gospodnetic
I wanted to be explicit for the OP. But wouldn't that depend on mm if you are using (e)dismax? Otis -- Performance Monitoring - http://sematext.com/spm/index.html Search Analytics - http://sematext.com/search-analytics/index.html On Mon, Nov 19, 2012 at 6:37 PM, Upayavira u...@odoko.co.uk

Re: Ranking by sorting score and rankingField better or by product(score, rankingField)?

2012-11-19 Thread Floyd Wu
Thanks Otis, But the sort=product(score, rankingField) is not working in my test. What is probably wrong? Floyd 2012/11/20 Otis Gospodnetic otis.gospodne...@gmail.com Hi, 3. yes, you can sort by function - http://search-lucene.com/?q=solr+sort+by+function 2. this will sort by score only

Re: Ranking by sorting score and rankingField better or by product(score, rankingField)?

2012-11-19 Thread Otis Gospodnetic
Hi, Do you see any errors? Which version of Solr? What does debugQuery=true say? Are you sure your file with ranks is being used? (remove it, put some junk in it, see if that gives an error) Otis -- Performance Monitoring - http://sematext.com/spm/index.html Search Analytics -

Re: Custom ranking solutions?

2012-11-19 Thread Floyd Wu
Hi Otis, The debug information is as follows; it seems there is no product() processing. <lst name="debug"> <str name="rawquerystring">_l_all:測試</str> <str name="querystring">_l_all:測試</str> <str name="parsedquery">PhraseQuery(_l_all:測 試)</str> <str name="parsedquery_toString">_l_all:測 試</str> <lst name="explain"> <str name="222

Re: Ranking by sorting score and rankingField better or by product(score, rankingField)?

2012-11-19 Thread Floyd Wu
Hi Otis, There is no error in the console nor in the log file. I'm using Solr-4.0. The external file name is external_rankingField.txt and it exists in the directory C:\solr-4.0.0\example\solr\collection1\data\external_rankingField.txt External file should work as well because when I issue query

Weird Behaviour on Solr 5x (SolrCloud)

2012-11-19 Thread deniz
Hi all, after Mark Miller made it clear to me that 5x supports cloud with ramdir, I have started playing with it and it seemed to work smoothly, except for a weird behaviour.. here is the story of it: Basically, I have pulled the code and built solr 5x, and then replaced the war file in webapps

Re: solr autocomplete requirement

2012-11-19 Thread Sujatha Arun
Anyone with suggestions on this? On Mon, Nov 19, 2012 at 10:13 PM, Sujatha Arun suja.a...@gmail.com wrote: Hi, Our requirement for auto complete is slightly complicated. We need two types of auto complete: 1. Meta data Auto complete 2. Full text Content Auto complete In addition the

Re: Custom ranking solutions?

2012-11-19 Thread Floyd Wu
Hi Otis, I'm doing some test like this, http://localhost:8983/solr/select/?fl=score,_l_unique_key&defType=func&q=product(abs(rankingField),abs(score)) and I get following response, <lst

configuring solr xml as a datasource

2012-11-19 Thread Leena Jawale
Hi, I am new to solr. I am trying to use solr xml data source for solr search engine. I have created test.xml file as - <add> <doc> <field name="fname">leena1</field> <field name="number">101</field> </doc> </add> I have created data-config.xml file <dataConfig> <dataSource type="FileDataSource" encoding="UTF-8"

Timeout when calling Luke request handler after migrating from Solr 3.5 to 3.6.1

2012-11-19 Thread Jose Aguilar
Hi all, As part of our business logic we query the Luke request handler to extract the fields in the index from our code using the following url: http://server:8080/solr/admin/luke?wt=json&numTerms=0 This worked fine with Solr 3.5, but now with 3.6.1 this call never returns, it hangs, and