Re: Solr PHP PECL Extension going to Stable Release - Wishing for Any New Features?

2010-10-11 Thread Lukas Kahwe Smith
On 11.10.2010, at 07:03, Israel Ekpo wrote: I am currently working on a couple of bug fixes for the Solr PECL extension that will be available in the next release 0.9.12 sometime this month. http://pecl.php.net/package/solr Documentation of the current API and features for the PECL

deleteByQuery issue

2010-10-11 Thread Claudio Atzori
Hi everybody, in my application I use an instance of EmbeddedSolrServer (solr 1.4.1), the following snippet shows how I am instantiating it: File home = new File(indexDataPath(solrDataDir, indexName)); container = new CoreContainer(indexDataPath(solrDataDir, indexName));

Solr start in server

2010-10-11 Thread Yavuz Selim YILMAZ
I have a solr installation on a server. I start it with the help of putty ( with the start.jar). But when I close the putty instance, automatically solr instance also closes. How can I solve this problem? I mean, I close connection with server, but solr instance still runs? -- Yavuz Selim YILMAZ

question about SolrCore

2010-10-11 Thread Li Li
Hi all, I want to know the details of IndexReader in SolrCore. I read a little of the SolrCore code. Here is my understanding; is it correct? Each SolrCore has many SolrIndexSearchers and keeps them in _searchers, and _searcher keeps track of the latest version of the index. Each

Re: How to manage different indexes for different users

2010-10-11 Thread Li Li
will one user search other user's index? if not, you can use multi cores. 2010/10/11 Tharindu Mathew mcclou...@gmail.com: Hi everyone, I'm using solr to integrate search into my web app. I have a bunch of users who would have to be given their own individual indexes. I'm wondering whether

Re: Multiple masters and replication between masters?

2010-10-11 Thread Arunkumar Ayyavu
Thanks Otis. That was helpful. On Mon, Oct 11, 2010 at 9:19 AM, Otis Gospodnetic otis_gospodne...@yahoo.com wrote: Arun, Yes, changing the solrconfig.xml to point to the new master could require a restart. However, if you use logical addresses (VIPs in the Load Balancer or even local

How to get Term Frequency

2010-10-11 Thread Ahson Iqbal
Hi all, I have a question: how can somebody get term frequency using solr/solrnet, as we do in Lucene with the following method: DocFreq(new Term(Field, value));

Re: Solr start in server

2010-10-11 Thread Yavuz Selim YILMAZ
I solved it nohup java -jar start.jar Thnx. -- Yavuz Selim YILMAZ 2010/10/11 Gora Mohanty g...@mimirtech.com On Mon, Oct 11, 2010 at 1:23 PM, Yavuz Selim YILMAZ yvzslmyilm...@gmail.com wrote: I use AIX 5.3. How can I handle? [...] Have not used AIX in ages, but this should work,

Re: Index time boosting is not working with boosting value in document level

2010-10-11 Thread Shanmugavel SRD
Eric, Score is not coming properly even after giving boost value in document and field level. Please find the solrconfig.xml, schema.xml, data-config.xml, the feed and the score query. Doc with id 'ABCDEF/L' is boosted and doc with id 'MA147LL/A' is not boosted, but both are returning

Search within a subset of documents

2010-10-11 Thread Sergey Bartunov
Is it possible to use Solr for searching within a subset of documents represented by enumeration of document IDs?

Re: How to get Term Frequency

2010-10-11 Thread Ahmet Arslan
I have a question that how could somebody get term frequency as we do get in lucene by the following method DocFreq(new Term(Field, value)); using solr/solrnet. You can get term frequency with http://wiki.apache.org/solr/TermVectorComponent. If you are interested in document frequency,
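A minimal sketch of what such a TermVectorComponent request might look like, built as a URL in Python. The base URL, the example query, and the handler name `tvrh` are assumptions for illustration (the handler must be configured with the component in solrconfig.xml, and the field must be indexed with term vectors enabled); `tv`, `tv.tf` and `tv.df` are the component's request parameters.

```python
from urllib.parse import urlencode


def term_vector_url(base_url, query, handler="tvrh"):
    """Build a select URL asking the TermVectorComponent for tf and df.

    `handler` is an assumed request handler name; use whatever handler
    has the TermVectorComponent registered in your solrconfig.xml.
    """
    params = [
        ("q", query),
        ("qt", handler),
        ("tv", "true"),     # switch the component on
        ("tv.tf", "true"),  # term frequency per document
        ("tv.df", "true"),  # document frequency per term
    ]
    return base_url + "/select?" + urlencode(params)


# Hypothetical host and query, for illustration only.
url = term_vector_url("http://localhost:8983/solr", "content:solr")
print(url)
```

Fetching the URL and parsing the response is left out, since the response layout depends on the Solr version and the chosen response writer.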

Re: deleteByQuery issue

2010-10-11 Thread Ahmet Arslan
--- On Mon, 10/11/10, Claudio Atzori claudio.atz...@isti.cnr.it wrote: From: Claudio Atzori claudio.atz...@isti.cnr.it Subject: deleteByQuery issue To: solr-user@lucene.apache.org Date: Monday, October 11, 2010, 10:38 AM Hi everybody, in my application I use an instance of

KStemmer for Solr

2010-10-11 Thread Bernd Fehling
Because I'm using solr from trunk and not from lucid imagination I was missing KStemmer. So I decided to add this stemmer to my installation. After some modifications KStemmer is now working fine as stand-alone. Now I have a KStemmerFilter. Next will be to write the KStemmerFilterFactory. I

Re: Index time boosting is not working with boosting value in document level

2010-10-11 Thread Ahmet Arslan
Eric, Score is not coming properly even after giving boost value in document and field level. Please find the solrconfig.xml, schema.xml, data-config.xml, the feed and the score query. Doc with id 'ABCDEF/L' is boosted and doc with id 'MA147LL/A' is not boosted, but both are

Re: KStemmer for Solr

2010-10-11 Thread Ahmet Arslan
Because I'm using solr from trunk and not from lucid imagination I was missing KStemmer. So I decided to add this stemmer to my installation. After some modifications KStemmer is now working fine as stand-alone. Now I have a KStemmerFilter. Next will be to write the

Re: Search within a subset of documents

2010-10-11 Thread Sergey Bartunov
Will it be efficient enough if the subset is really large? On 11 October 2010 18:39, Gora Mohanty g...@mimirtech.com wrote: On Mon, Oct 11, 2010 at 7:00 PM, Sergey Bartunov sbos@gmail.com wrote: Is it possible to use Solr for searching within a subset of documents represented by

facet.method: enum vs. fc

2010-10-11 Thread Paolo Castagna
Hi, I am using Solr v1.4 and I am not sure which facet.method I should use. What should I use if I do not know in advance if the number of values for a given field will be high or low? What are the pros/cons of using facet.method=enum vs. facet.method=fc? When should I use enum vs. fc? I have

Re: deleteByQuery issue

2010-10-11 Thread Claudio Atzori
On 10/11/2010 04:06 PM, Ahmet Arslan wrote: --- On Mon, 10/11/10, Claudio Atzori claudio.atz...@isti.cnr.it wrote: From: Claudio Atzori claudio.atz...@isti.cnr.it Subject: deleteByQuery issue To: solr-user@lucene.apache.org Date: Monday, October 11, 2010, 10:38 AM Hi everybody, in my

Re: How to manage different indexes for different users

2010-10-11 Thread Markus Jelsma
Then you probably read on how to create [1] the new core. Keep in mind, you might need to do some additional local scripting to create a new instance dir. Do the users share the same schema? If so, you'd be better off keeping a single index and preventing the users from querying others. [1]:

Re: Search within a subset of documents

2010-10-11 Thread Gora Mohanty
On Mon, Oct 11, 2010 at 8:20 PM, Sergey Bartunov sbos@gmail.com wrote: Will it be efficient enough if the subset is really large? [...] If the subset of IDs is large, and disjoint (so that you cannot use ranges), the query might look ugly, but generating it should not be much of a
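As a rough illustration of generating such a query, the sketch below joins a list of document IDs into a single OR clause that could be sent as a filter query. The field name `id` and the helper name are assumptions, not something stated in the thread. Note that Solr limits the number of clauses in a boolean query (maxBooleanClauses in solrconfig.xml, 1024 by default), so very large subsets may need that limit raised or the query split up.

```python
def ids_filter_query(id_field, ids):
    """Return a query such as id:("a" OR "b") for the given document IDs.

    Each ID is wrapped in double quotes, with backslashes and quotes
    escaped, so that characters like '/' or ':' are not parsed as syntax.
    """
    ids = list(ids)
    if not ids:
        raise ValueError("at least one document ID is required")
    quoted = []
    for doc_id in ids:
        text = str(doc_id).replace("\\", "\\\\").replace('"', '\\"')
        quoted.append('"' + text + '"')
    return id_field + ":(" + " OR ".join(quoted) + ")"


# Hypothetical field name and IDs, for illustration only.
fq = ids_filter_query("id", ["MA147LL/A", "ABCDEF/L"])
print(fq)
```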

Re: Problem with Indexing

2010-10-11 Thread Gora Mohanty
On Mon, Oct 11, 2010 at 1:27 PM, Jörg Agatz joerg.ag...@googlemail.com wrote: ok, i have try it.. and now iget this error: POSTing file e067f59c-d046-11df-b552-000c29e17baa_SEARCH.xml SimplePostTool: FATAL: Solr returned an error:

Re: facet.method: enum vs. fc

2010-10-11 Thread Erick Erickson
Yep, that was probably the best choice. It's a classic time/space tradeoff. The enum method creates a bitset for #each# unique facet value. The bit set is (maxdocs / 8) bytes in size (I'm ignoring some overhead here). So if your facet field has 10 unique values, and 8M documents, you'll use up
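The snippet is cut off before the total, so here is only the arithmetic implied by the stated formula, as a small sketch: one bitset of (maxdocs / 8) bytes per unique facet value, ignoring overhead as the post does. The function name is made up for illustration.

```python
def enum_bitset_bytes(unique_values, max_docs):
    """Approximate memory for facet.method=enum: one bitset per unique value.

    Each bitset holds one bit per document, i.e. max_docs / 8 bytes.
    Overhead is ignored, as in the post.
    """
    return unique_values * (max_docs // 8)


# The example from the post: 10 unique values, 8M documents.
total = enum_bitset_bytes(10, 8_000_000)
print(total)  # 10 bitsets of 1,000,000 bytes each
```

By this estimate, a field with many thousands of unique values grows the total proportionally, which is the space side of the tradeoff described in the post.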

Re: How to manage different indexes for different users

2010-10-11 Thread Tharindu Mathew
On Mon, Oct 11, 2010 at 10:48 PM, Markus Jelsma markus.jel...@openindex.io wrote: Then you probably read on how to create [1] the new core. Keep in mind, you might need to do some additional local scripting to create a new instance dir. Do the users share the same schema? If so, you'd be

Re: How to manage different indexes for different users

2010-10-11 Thread Markus Jelsma
Well, set the user ID for each document and use a filter query to filter only on field:current_user_id. On Mon, 11 Oct 2010 23:25:29 +0530, Tharindu Mathew mcclou...@gmail.com wrote: On Mon, Oct 11, 2010 at 10:48 PM, Markus Jelsma wrote: Then you probably read on how to create [1] the new
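A minimal sketch of that approach: the user's search goes in `q`, and the restriction to the current user goes in a separate `fq` filter query. The field name `user_id` and the helper name are assumptions for illustration; the user ID should come from the server-side session rather than from the request, so that users cannot query each other's documents.

```python
from urllib.parse import urlencode


def user_scoped_params(user_query, user_id, user_field="user_id"):
    """Build query-string parameters restricting results to one user.

    `user_field` is an assumed schema field holding the owner's ID.
    int() rejects anything that is not a plain number, so the value
    cannot smuggle extra query syntax into the filter.
    """
    params = [
        ("q", user_query),
        ("fq", "%s:%d" % (user_field, int(user_id))),
    ]
    return urlencode(params)


qs = user_scoped_params("solr search", 42)
print(qs)
```

Because the filter is sent as `fq` rather than appended to `q`, it does not affect scoring and can be cached independently of the main query.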

Re: Sorting on arbitary 'custom' fields

2010-10-11 Thread Simon Wistow
On Sat, Oct 09, 2010 at 06:31:19PM -0400, Erick Erickson said: I'm confused. What do you mean that a user can set any number of arbitrarily named fields on a document. It sounds like you are talking about a user adding arbitrarily many entries to a multi-valued field? Or is it some kind of

Re: How to manage different indexes for different users

2010-10-11 Thread Tharindu Mathew
Great! Just what I need. Thanks for all the help. I'll let you know how it goes. On Mon, Oct 11, 2010 at 11:37 PM, Markus Jelsma markus.jel...@openindex.io wrote: Well, set the user ID for each document and use a filter query to filter only on field:current_user_id. On Mon, 11 Oct 2010

Solr unresponsive but still taking queries

2010-10-11 Thread Hitendra Molleti
Hi, We are running a CMS based on Java and use Solr 1.4 as the indexer. Till today afternoon things were fine until we hit this Solr issue where it sort of becomes unresponsive. We tried to stop and restart Solr but no help. When we look into the logs Solr is receiving queries and

Re: Prioritizing advectives in solr search

2010-10-11 Thread Chris Hostetter
: here is my scenario, im using dismax handler and my understanding is when I : query Blue hammer, solr brings me results for blue hammer, blue and : hammer, and in the same hierarchy, which is understandable, is there any : way I can manage the blue keyword, so that solr searches for blue hammer

data import / delta question

2010-10-11 Thread Tim Heckman
My data-import-config.xml has a parent entity and a child entity. The data is coming from rdbms's. I'm trying to make use of the delta-import feature where a change in the child entity can be used to regenerate the entire document. The child entity is on a different database (and a different

Re: CoreContainer Usage

2010-10-11 Thread Amit Nithian
Hi sorry perhaps my question wasn't very clear. Basically I am trying to build a federated search where I blend the results of queries to multiple cores together. This is like distributed search but I believe the distributed search will issue network calls which I would like to avoid. I have read

Re: Prioritizing adjectives in solr search

2010-10-11 Thread Erick Erickson
You can do some interesting things with payloads. You could index a particular value as the payload that identified the kind of word it was, where kind is something you define. Then at query time, you could boost depending on what kind of word you identified it as in both the query and at

Deleting Documents with null fields by query

2010-10-11 Thread Claudio Devecchi
Hi everybody, I'm trying to delete by query some documents with null content (this happened because I crawled my intranet and some things came back null). When I try this, it works fine (I'm deleting from my solr index every document that doesn't have wiki in the field content): curl

Re: deleteByQuery issue

2010-10-11 Thread Erick Erickson
I'd guess that after you delete your documents and commit, you're still using an IndexReader that you haven't reopened when you search. WARNING: I'm not all that familiar with EmbeddedSolrServer, so this may be way off base. HTH Erick On Mon, Oct 11, 2010 at 12:04 PM, Claudio Atzori

Re: Disable (or prohibit) per-field overrides

2010-10-11 Thread Erick Erickson
Have you looked at invariants in solrconfig.xml? Best Erick On Mon, Oct 11, 2010 at 12:23 PM, Markus Jelsma markus.jel...@openindex.io wrote: Hi, Anyone knows useful method to disable or prohibit the per-field override features for the search components? If not, where to start to make it

Re: Disable (or prohibit) per-field overrides

2010-10-11 Thread Markus Jelsma
Yes, we're using it but the problem is that there can be many fields and that means quite a large list of parameters to set for each request handler, and there can be many request handlers. It's not very practical for us to maintain such big set of invariants. Thanks On Mon, 11 Oct 2010

Re: Solr unresponsive but still taking queries

2010-10-11 Thread Erick Erickson
The first question is what's been changing? I suspect something's been growing right along and finally tripped you up. Places I would look first: 1) How much free space is on your disk? Have your logs (or other files) grown without bound? 2) If this is a Unix box, what does top report? In other

Re: data import / delta question

2010-10-11 Thread Erick Erickson
Without seeing your DIH config, it's really hard to say much of anything. You can gain finer control over edge cases by writing a Java app that uses SolrJ if necessary. HTH Erick On Mon, Oct 11, 2010 at 3:27 PM, Tim Heckman theck...@gmail.com wrote: My data-import-config.xml has a parent

Re: Deleting Documents with null fields by query

2010-10-11 Thread Erick Erickson
Have you tried something like: '<delete><query>*:* AND -content:[* TO *]</query></delete>' On Mon, Oct 11, 2010 at 4:01 PM, Claudio Devecchi cdevec...@gmail.com wrote: Hi everybody, I'm trying to delete by query some documents with null content (this happened because I crawled my intranet and

Re: Disable (or prohibit) per-field overrides

2010-10-11 Thread Erick Erickson
I'm clueless in that case, because you're right, that's a lot of picky maintenance Sorry 'bout that Erick On Mon, Oct 11, 2010 at 4:18 PM, Markus Jelsma markus.jel...@openindex.io wrote: Yes, we're using it but the problem is that there can be many fields and that means quite a large list

Re: Deleting Documents with null fields by query

2010-10-11 Thread Claudio Devecchi
Yes, it doesn't work; doing that erases all the content. :( Alternatively, it would help me to have a query that doesn't return the null ones. Thanks. On Mon, Oct 11, 2010 at 5:27 PM, Erick Erickson erickerick...@gmail.com wrote: Have you tried something like: '<delete><query>*:* AND -content:[* TO

Re: Deleting Documents with null fields by query

2010-10-11 Thread Erick Erickson
erase all the content. Oops. First, I should look more carefully. You don't want the AND in there, use <delete><query>*:* -content:[* TO *]</query></delete> In general, don't mix and match booleans and native Lucene query syntax... Before sending this to Solr, what do you get back when you try just the
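A small sketch of building that delete message programmatically, so the XML is always well-formed and any special characters in the query are escaped. The helper name is made up for illustration, and the query string is the one suggested in the reply above. Posting the payload to the update handler and committing afterwards are left out.

```python
import xml.etree.ElementTree as ET


def delete_by_query_xml(query):
    """Return a Solr <delete><query>...</query></delete> update message."""
    root = ET.Element("delete")
    ET.SubElement(root, "query").text = query
    return ET.tostring(root, encoding="unicode")


# Matches every document that has no value in the 'content' field.
payload = delete_by_query_xml("*:* -content:[* TO *]")
print(payload)
```

As the reply suggests, it is worth running the same query as a normal search first to confirm which documents it matches before sending it as a delete.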

Re: data import / delta question

2010-10-11 Thread Tim Heckman
Thanks, Erick. I was starting to think I may have to go the SolrJ route. Here's a simplified version of my DIH config showing what I'm trying to do. dataConfig dataSource name=PROD type=JdbcDataSource driver=com.microsoft.sqlserver.jdbc.SQLServerDriver

weighted facets

2010-10-11 Thread Peter Karich
Hi, I need a feature which is well explained from Mr Goll at this site ** So, it then would be nice to do sth. like: facet.stats=sum(fieldX)&facet.stats.sort=fieldX And the output (sorted against the sum-output) can look sth. like this: lst name=facet_counts lst name=facet_fields lst

Re: StatsComponent and multi-valued fields

2010-10-11 Thread Chris Hostetter
: I'm able to execute stats queries against multi-valued fields, but when : given a facet, the statscomponent only considers documents that have a facet : value as the last value in the field. : : As an example, imagine you are running stats on fooCount, and you want to : facet on bar, which is

Re: having problem about Solr Date Field.

2010-10-11 Thread Chris Hostetter
: Of course if your index is for users in one time zone only, you may : insert the local time to Solr, and everything will work well. However, This is a bad assumption to make -- it will screw you up if your one time zone has anything like Daylight Saving Time (Because UTC Does not) -Hoss
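One way to follow this advice is to convert every timestamp to UTC before indexing and format it in the form Solr date fields expect (for example 2010-10-11T16:30:00Z). The sketch below assumes timezone-aware datetimes; the fixed +02:00 offset is only an illustration, and real code would take the zone from a time zone database so that DST transitions are handled.

```python
from datetime import datetime, timedelta, timezone


def to_solr_date(local_dt):
    """Convert a timezone-aware datetime to a UTC string for a Solr date field."""
    if local_dt.tzinfo is None:
        # A naive datetime carries no offset, so converting it would be a guess.
        raise ValueError("expected a timezone-aware datetime")
    return local_dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


# Illustration only: a local time at a fixed UTC+2 offset.
utc_plus_2 = timezone(timedelta(hours=2))
stamp = to_solr_date(datetime(2010, 10, 11, 18, 30, 0, tzinfo=utc_plus_2))
print(stamp)
```

The same conversion applies to range boundaries at query time: convert the user's local start and end times to UTC first, then build the range query.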

Re: configuring custom CharStream in solr

2010-10-11 Thread Koji Sekiguchi
(10/10/12 5:57), Michael Sokolov wrote: I would like to inject my CharStream (or possibly it could be a CharFilter; this is all in flux at the moment) into the analysis chain for a field. Can I do this in solr using the Analyzer configuration syntax in schema.xml, or would I need to define my

Re: How to get line numbers from Solr plugin to show up in stack trace

2010-10-11 Thread Chris Hostetter
: Hello, I am writing a clustering component for Solr. It registers, loads and : works properly. However, whenever there is an exception inside my plugin, I : cannot get tomcat to show me the line numbers. It always says Unknown source : for my classes. The stack trace in tomcat shows line

Re: Where is the lock file?

2010-10-11 Thread Chris Hostetter
: I've looked through the configuration file. I can see where it defines the : lock type and I can see the unlock configuration. But I don't see where it : specifies the lock file. Where is it? What is its name? as mentioned in the stack trace you pasted, the name of the lock file in question

Re: Records from DIH not easily queried for

2010-10-11 Thread Dennis Gearon
Well, found the problem, us of course. We were using string instead of text for the field type in the schema config file. So it wasn't tokenizing words or doing other 'search by word' enabling preprocessing before storing the document in the index. We could have only found whole sentences.

Re: having problem about Solr Date Field.

2010-10-11 Thread Dennis Gearon
So, regarding DST, do you put everything in GMT, and make adjustments in the 'search for/between' date/time values before the query, for both DST and TZ? Dennis Gearon Signature Warning It is always a good idea to learn from your own mistakes. It is usually a better idea to

Re: Trouble with exception Document [Null] missing required field DocID

2010-10-11 Thread Chris Hostetter
: Right. You're requiring that every document have an ID (via uniqueKey), but : there's nothing : magic about DIH that'll automagically parse a PDF file and map something : into your ID : field. : : So you have to create a unique ID before you send your doc to Curl. I'm a) This example isn't

multicore replication slave

2010-10-11 Thread Christopher Bottaro
Hello, I can't get my multicore slave to replicate from the master. The master is setup properly and the following urls return 00OKNo command as expected: http://solr.mydomain.com:8983/solr/core1/replication http://solr.mydomain.com:8983/solr/core2/replication

LuceneRevolution - NoSQL: A comparison

2010-10-11 Thread Peter Keegan
I listened with great interest to Grant's presentation of the NoSQL comparisons/alternatives to Solr/Lucene. It sounds like the jury is still out on much of this. Here's a use case that might favor using a NoSQL alternative for storing 'stored fields' outside of Lucene. When Solr does a

Re: configuring custom CharStream in solr

2010-10-11 Thread Michael Sokolov
On 10/11/2010 6:41 PM, Koji Sekiguchi wrote: (10/10/12 5:57), Michael Sokolov wrote: I would like to inject my CharStream (or possibly it could be a CharFilter; this is all in flux at the moment) into the analysis chain for a field. Can I do this in solr using the Analyzer configuration

Re: configuring custom CharStream in solr

2010-10-11 Thread Michael Sokolov
On 10/11/2010 8:38 PM, Michael Sokolov wrote: On 10/11/2010 6:41 PM, Koji Sekiguchi wrote: (10/10/12 5:57), Michael Sokolov wrote: I would like to inject my CharStream (or possibly it could be a CharFilter; this is all in flux at the moment) into the analysis chain for a field. Can I do

Re: configuring custom CharStream in solr

2010-10-11 Thread Chris Hostetter
: OK - I found the answer pecking through the source - apparently the name of : the element to configure a CharFilter is charFilter - fancy that :) there's even an example, right there on the wiki... http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#CharFilterFactories -Hoss

Re: LuceneRevolution - NoSQL: A comparison

2010-10-11 Thread Yonik Seeley
On Mon, Oct 11, 2010 at 8:32 PM, Peter Keegan peterlkee...@gmail.com wrote: I listened with great interest to Grant's presentation of the NoSQL comparisons/alternatives to Solr/Lucene. It sounds like the jury is still out on much of this. Here's a use case that might favor using a NoSQL

Re: LuceneRevolution - NoSQL: A comparison

2010-10-11 Thread Dennis Gearon
Well, I think that if someone is searching the 'whole of the dataset' to find the 'individual data' then an SQL database outside of Solr makes as much sense. There's plenty of data in the world or most applications that needs to stay normalized or at least has benefits to being that way.

Re: LuceneRevolution - NoSQL: A comparison

2010-10-11 Thread Dennis Gearon
It sounds, of course, a lot like transaction isolation using MVCC. It's the obvious solution, and has been since the late 1970s. I hope it won't be too hard to convince people to use it :-) It's been the reason for the early success of Oracle. Dennis Gearon Signature Warning

Re: configuring custom CharStream in solr

2010-10-11 Thread Michael Sokolov
On 10/11/2010 10:18 PM, Chris Hostetter wrote: : OK - I found the answer pecking through the source - apparently the name of : the element to configure a CharFilter is charFilter - fancy that :) there's even an example, right there on the wiki...

Re: facet.method: enum vs. fc

2010-10-11 Thread Paolo Castagna
Thank you Erick, your explanation was helpful. I'll stick with fc and come back to this later if I need further tuning. Paolo Erick Erickson wrote: Yep, that was probably the best choice. It's a classic time/space tradeoff. The enum method creates a bitset for #each# unique facet value.