Re: filter query parsing problem

2010-01-20 Thread Shalin Shekhar Mangar
On Tue, Jan 19, 2010 at 3:10 AM, Ahmet Arslan iori...@yahoo.com wrote: I am submitting a query and it seems to be parsing incorrectly. Here is the query with the debug output. Any ideas what the problem is: arr name=filter_queries str ((VLog:814124 || VLog:12342)

Re: How to backup / dump solr database

2010-01-20 Thread Shalin Shekhar Mangar
On Tue, Jan 19, 2010 at 6:38 PM, jmf jeanmichel.franc...@makina-corpus.com wrote: Hi, I'm using solr with the Plone CMS. I have just followed some tutorials, and I would like to 'dump' the solr database on the production server and make it run on my development environment. Both are linux.

AW: Restricting Facet to FilterQuery in combination with mincount

2010-01-20 Thread Chantal Ackermann
Thank you, Chris! That did clarify it. :-) Cheers, Chantal From: Chris Hostetter [hossman_luc...@fucit.org] Sent: Tuesday, 19 January 2010 23:27 To: solr-user@lucene.apache.org Subject: Re: Restricting Facet to FilterQuery in combination with mincount

AW: TermsComponent, multiple fields, total count

2010-01-20 Thread Chantal Ackermann
I find the DismaxRequestHandler perfect for matching multiple fields, matching phrases in other/subset of fields, weighting the different matches. It's powerful and fast. You can define several DismaxRequestHandlers if you want to offer different kinds of search areas to the user (e.g. search

Re: Please help: Failing tests

2010-01-20 Thread Shalin Shekhar Mangar
On Wed, Jan 20, 2010 at 2:26 AM, Siv Anette Fjellkårstad s...@steria.no wrote: I'm trying to run the unit tests from Eclipse. Almost half the tests are failing, and I don't know what I'm doing wrong. This is what I've done: 1. Checked out the code outside Eclipse's workspace 2. File New

Ruby client fails to build

2010-01-20 Thread Siddhant Goel
Hi, I'm using Solr 1.4 (and trying to use the Ruby client (solr-ruby) to access it). The problem is that I just can't get it to work. :-) If I run the tests (rake test), it fails giving me the following output - /path/to/solr-ruby/test/unit/delete_test.rb:52: invalid multibyte char (US-ASCII)

Re: Fastest way to use solrj

2010-01-20 Thread Noble Paul നോബിള്‍ नोब्ळ्
2010/1/20 Tim Terlegård tim.terleg...@gmail.com: BinaryRequestWriter does not read from a file and post it Is there any other way or is this use case not supported? I tried this: $ curl host/solr/update/javabin -F stream.file=/tmp/data.bin $ curl host/solr/update -F stream.body='<commit />'

SV: Please help: Failing tests

2010-01-20 Thread Siv Anette Fjellkårstad
Thank you so much - that helped a lot. Now most of the tests are green, but I still have some failing. One of the failing tests is testMultiThreade and the error message is: Caused by: org.apache.solr.common.SolrException: QueryElevationComponent missing config file: 'elevate.xml either:

Re: Ruby client fails to build

2010-01-20 Thread Erik Hatcher
Where are you getting your solr-ruby code from? You can simply gem install it to pull in an already pre-built gem. I just ran the tests on trunk, all passed, with the output pasted below. Erik ~/dev/solr/client/ruby/solr-ruby: rake test (in

Re: Ruby client fails to build

2010-01-20 Thread Siddhant Goel
On Wed, Jan 20, 2010 at 4:19 PM, Erik Hatcher erik.hatc...@gmail.com wrote: Where are you getting your solr-ruby code from? You can simply gem install it to pull in an already pre-built gem. I'm just picking it up from the 1.4 release. I also tried checking out the latest copy from svn, but

Re: Ruby client fails to build

2010-01-20 Thread Erik Hatcher
On Jan 20, 2010, at 6:32 AM, Siddhant Goel wrote: On Wed, Jan 20, 2010 at 4:19 PM, Erik Hatcher erik.hatc...@gmail.com wrote: Where are you getting your solr-ruby code from? You can simply gem install it to pull in an already pre-built gem. I'm just picking it up from the 1.4 release. I

big index vs. lots of small ones

2010-01-20 Thread Thorsten Scherler
Hi all, I have to do an analysis of the following use case. I am working as a consultant in a public company. We are talking about offering each public institution its own search server in the future, (probably) based on Apache Solr. However the user of our portal should be able to search all

Re: filter query parsing problem

2010-01-20 Thread Ahmet Arslan
If they are really filter queries i.e. specified through fq then they will not be run through an analyzer. Does this mean filter queries are not analyzed? The query below returns a document.

LucidGaze, No Data

2010-01-20 Thread Markus Jelsma
Hello all, I have installed and reconfigured everything according to the readme supplied with the recent LucidGaze release. Files have been written in the gaze directory in SOLR_HOME but the *.log.x.y files are all empty! The rrd directory does contain something that is about 24MiB. In the

Re: filter query parsing problem

2010-01-20 Thread Erik Hatcher
On Jan 20, 2010, at 8:11 AM, Ahmet Arslan wrote: If they are really filter queries i.e. specified through fq then they will not be run through an analyzer. Does this mean filter queries are not analyzed? The query below returns a document.

Re: filter query parsing problem

2010-01-20 Thread Shalin Shekhar Mangar
On Wed, Jan 20, 2010 at 7:40 PM, Erik Hatcher erik.hatc...@gmail.com wrote: On Jan 20, 2010, at 8:11 AM, Ahmet Arslan wrote: If they are really filter queries i.e. specified through fq then they will not be run through an analyzer. Does this mean filter queries are not analyzed? The

Re: filter query parsing problem

2010-01-20 Thread Erik Hatcher
On Jan 20, 2010, at 9:34 AM, Shalin Shekhar Mangar wrote: On Wed, Jan 20, 2010 at 7:40 PM, Erik Hatcher erik.hatc...@gmail.com wrote: On Jan 20, 2010, at 8:11 AM, Ahmet Arslan wrote: If they are really filter queries i.e. specified through fq then they will not be run through an

Re: Unstemming after solr.PorterStemFilterFactory

2010-01-20 Thread Bogdan Vatkov
Hi Eric, I think I realize that and I am actually using this - I am using the stemmed, cased etc. token from the stored term vectors and additionally I am using the field values. But the field values are different from the tokens in the level of granularity. When I access the term vector for my

Re: Extracting URLs while indexing

2010-01-20 Thread Bogdan Vatkov
Sorry, I meant completely server-side - even more, I want that at indexing time (I do not care about query-time as I am reading the whole index later anyway). On Wed, Jan 20, 2010 at 2:40 AM, Erick Erickson erickerick...@gmail.com wrote: Do you mean you want the URLs to be extracted on the

Field collapsing works but is tree modeling possible?

2010-01-20 Thread Kelly Taylor
I'm currently using the latest SOLR-236 patch (12/24/2009) and field-collapsing seems to be giving me the desired results, but I'm wondering if I should focus more on a tree view of my catalog data instead, as described in Beyond Basic Faceted Search Could either of the patches for SOLR-792 or

Replication clients logs in solr 1.4

2010-01-20 Thread Jérôme Etévé
Hi All, I'm using the built-in replication with master/slave(s) Solr and the indices are replicating just fine. Just one thing troubles me: Nothing happens in my logs/ directory .. On the slave(s), no logs/snapshot.current file. And on the master, nothing appears in logs/clients/ either. The

Solr query single entity?

2010-01-20 Thread fredanthony
Hi, I have Solr setup to use a DataImportHandler with my database. In the data-config.xml file I have one document with two entities as follows: document entity name=users query=SELECT user_id, user_id as pk_field, user_name FROM users

Re: big index vs. lots of small ones

2010-01-20 Thread Marc Sturlese
Check out this patch which solves the distributed IDF problem: https://issues.apache.org/jira/browse/SOLR-1632 I think it fixes what you are explaining. The price you pay is that there are 2 requests per shard. If I am not wrong the first is to get term frequencies and needed info and the second

Re: Unstemming after solr.PorterStemFilterFactory

2010-01-20 Thread Erick Erickson
Ah, OK. I take the unnecessary comment back. If you require the original form of the tokens (not just the original text), then you do have to do something to preserve them, so I think you're on the right track FWIW Erick On Wed, Jan 20, 2010 at 9:38 AM, Bogdan Vatkov

Re: Extracting URLs while indexing

2010-01-20 Thread Erick Erickson
I guess it depends on what you mean by extract. There's nothing that I know of that, say, stores them to a file or separate field, or even does anything special with them. I think StandardTokenizerFactory tries to keep URLs together as a token in the field, but it's just another token... You

Re: [1.3] help with update timeout issue?

2010-01-20 Thread Jerome L Quinn
Lance Norskog goks...@gmail.com wrote on 01/16/2010 12:43:09 AM: If your indexing software does not have the ability to retry after a failure, you might wish to change the timeout from 20 seconds to, say, 5 minutes. I can make it retry, but I have somewhat real-time processes doing these

Re: Replication clients logs in solr 1.4

2010-01-20 Thread Jérôme Etévé
Oops. OK, my mistake. The logs are actually for the Solr 1.3 system-scripts-based distribution only. And the config files synchronize only on change .. J. 2010/1/20 Jérôme Etévé jerome.et...@gmail.com: Hi All, I'm using the built-in replication with master/slave(s) Solr and the indices

Re: Unstemming after solr.PorterStemFilterFactory

2010-01-20 Thread Bogdan Vatkov
Thanks! It is good to know I did not do something in vain :) On Wed, Jan 20, 2010 at 6:54 PM, Erick Erickson erickerick...@gmail.com wrote: Ah, OK. I take the unnecessary comment back. If you require the original form of the tokens (not just the original text), then you do have to do something

Re: Extracting URLs while indexing

2010-01-20 Thread Bogdan Vatkov
I am not absolutely sure about what I am saying but I think after tokenization I get the URLs as single tokens but with all the interesting symbols :) like /,: removed from the token. Is it normal? Is there a chance I misconfigured something? Best regards, Bogdan On Wed, Jan 20, 2010 at 7:03 PM,

Re: Extracting URLs while indexing

2010-01-20 Thread Erick Erickson
That's really hard to say without seeing your configuration G... If your field has WordDelimiterFactory with the proper catenate options set to one, that'd do it. Can you post the relevant parts of your schema? Erick On Wed, Jan 20, 2010 at 12:46 PM, Bogdan Vatkov bogdan.vat...@gmail.com wrote:

Re: Extracting URLs while indexing

2010-01-20 Thread Bogdan Vatkov
that is the field type: fieldType name=body_text class=solr.TextField positionIncrementGap=100 analyzer type=index tokenizer class=solr.WhitespaceTokenizerFactory/ !-- in this example, we will only use synonyms at query time filter class=solr.SynonymFilterFactory

Need help : Solr configuration issue for sorting on title field

2010-01-20 Thread EL KASMI Hicham
Hello, We have a problem with sorting on title field in Solr instance of our production repository, we get the error message: HTTP Status 500 - there are more terms than documents in field titleStr, but it's impossible to sort on tokenized fields. After some googling and searching in this

Re: Extracting URLs while indexing

2010-01-20 Thread Erick Erickson
You really need to have this page as a handy reference. http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters Look in particular at what happens with WordDelimiterFilterFactory, you're breaking your tokens up on non-alpha

Re: solr blocking on commit

2010-01-20 Thread Jerome L Quinn
ysee...@gmail.com wrote on 01/19/2010 06:05:45 PM: On Tue, Jan 19, 2010 at 5:57 PM, Steve Conover scono...@gmail.com wrote: I'm using latest solr 1.4 with java 1.6 on linux.  I have a 3M document index that's 10+GB.  We currently give solr 12GB of ram to play in and our machine has 32GB

Re: solr blocking on commit

2010-01-20 Thread Yonik Seeley
On Wed, Jan 20, 2010 at 2:18 PM, Jerome L Quinn jlqu...@us.ibm.com wrote: This is essentially the same problem I'm fighting with.  Once in a while, commit causes everything to freeze, causing add commands to timeout. This could be a bit different. Commits do currently block other update

Re: solr blocking on commit

2010-01-20 Thread Jerome L Quinn
ysee...@gmail.com wrote on 01/20/2010 02:24:04 PM: On Wed, Jan 20, 2010 at 2:18 PM, Jerome L Quinn jlqu...@us.ibm.com wrote: This is essentially the same problem I'm fighting with.  Once in a while, commit causes everything to freeze, causing add commands to timeout. This could be a bit

Re: Extracting URLs while indexing

2010-01-20 Thread Bogdan Vatkov
Now I see I didn't review all the config that I took from the default config. Removed the WordDelimiterFilter and the StandardTokenizer seems to keep URLs but splits relative paths (e.g. /file/location/file.txt) and I want to keep such as single token. Any ideas? On Wed, Jan 20, 2010 at 8:13 PM,

filter querying working on dynamic int fields but not dynamic string fields?

2010-01-20 Thread Tommy Chheng
I'm having trouble doing a filter query on a string field. Any ideas why it's working on dynamic int fields but not dynamic string fields? ex. http://localhost:8983/solr/select?indent=on&version=2.2&q=climate - correct

Re: filter querying working on dynamic int fields but not dynamic string fields?

2010-01-20 Thread Erik Hatcher
On Jan 20, 2010, at 4:27 PM, Tommy Chheng wrote: I'm having trouble doing a filter query on a string field. Any ideas why it's working on dynamic int fields but not dynamic string fields? ex. http://localhost:8983/solr/select?indent=on&version=2.2&q=climate - correct

Re: filter querying working on dynamic int fields but not dynamic string fields?

2010-01-20 Thread Rob Casson
http://localhost:8983/solr/select?indent=on&version=2.2&q=climate&fq=awardinstrument_s:Continuing+grant <str name="awardinstrument_s">Continuing grant </str> everything that erik already mentioned, but looks like you also have a trailing space in the document, so even quoting it would require that last

Re: filter querying working on dynamic int fields but not dynamic string fields?

2010-01-20 Thread Tommy Chheng
Thanks, quoting it fixed it. I'm also going to strip the leading/trailing whitespace at index time. Tommy On 1/20/10 1:47 PM, Erik Hatcher wrote: On Jan 20, 2010, at 4:27 PM, Tommy Chheng wrote: I'm having trouble doing a filter query on a string field. Any ideas why it's working on
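
The fix described in this thread (quote the phrase in the filter query, and strip stray whitespace before indexing) can be sketched roughly as follows. This is a minimal Python sketch, not code from the thread: the host, port, and field name come from the example URL quoted above, and the helper names `build_select_url` and `clean_value` are hypothetical.

```python
from urllib.parse import urlencode

# Endpoint taken from the example URL in the thread
SOLR_SELECT = "http://localhost:8983/solr/select"


def build_select_url(query, field, value):
    """Build a select URL whose fq matches `value` as one quoted phrase."""
    # Escape backslashes and double quotes, then wrap the value in quotes
    # so the query parser treats "Continuing grant" as a single term.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    params = [("q", query), ("fq", f'{field}:"{escaped}"')]
    return SOLR_SELECT + "?" + urlencode(params)


def clean_value(value):
    """Strip leading/trailing whitespace before indexing (hypothetical helper)."""
    return value.strip()


url = build_select_url("climate", "awardinstrument_s", clean_value("Continuing grant "))
print(url)
```

Letting `urlencode` do the percent-encoding avoids hand-building the query string, which is where the unquoted space caused trouble in the first place.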

Re: Contributors - Solr in Action Case Studies

2010-01-20 Thread Tom Burton-West
Hi Otis, We are using Solr to provide indexing for the full text of 5 million books (about 4-6 terabytes of text). Our index is currently around 3 terabytes distributed over 10 shards with about 310 GB of index per shard. We are using very large Solr documents (about 750MB of

Re: Need help : Solr configuration issue for sorting on title field

2010-01-20 Thread Chris Hostetter
: Subject: Need help : Solr configuration issue for sorting on title field : In-Reply-To: e9c993f11001200946j73dada43v5b7c7769d9e76...@mail.gmail.com : References: e9c993f11001191448k683db9fbud56276361ae20...@mail.gmail.com : 359a92831001191640v7c063e28y8b3376b71ec3d...@mail.gmail.com :

Re: build path

2010-01-20 Thread Chris Hostetter
: Subject: build path : References: 219927.42092...@web52905.mail.re2.yahoo.com http://people.apache.org/~hossman/#threadhijack Thread Hijacking on Mailing Lists When starting a new discussion on a mailing list, please do not reply to an existing message, instead start a fresh email. Even if

Re: solr blocking on commit

2010-01-20 Thread Jerome L Quinn
ysee...@gmail.com wrote on 01/20/2010 02:24:04 PM: On Wed, Jan 20, 2010 at 2:18 PM, Jerome L Quinn jlqu...@us.ibm.com wrote: This is essentially the same problem I'm fighting with.  Once in a while, commit causes everything to freeze, causing add commands to timeout. This could be a bit

Problems with spellchecker

2010-01-20 Thread Simon Wistow
The spellchecker in my 1.4 install started behaving increasingly erratically and suggestions would only be returned some of the time with the same query. I tried to force a rebuild using spellcheck.build=yes The full request being /select/?q=alexandr the great indent=on fl=title

Re: Dynamic boosting of ids at search time

2010-01-20 Thread Lance Norskog
http://www.lucidimagination.com/search/document/CDRG_ch04_4.4.4?q=ExternalFileField This lets you make a file with a boost value for every document. You can change the file and reload the new values with a commit. It hasn't been materially changed since 2007 and there are no unit tests, so it

filter query granularity

2010-01-20 Thread Wangsheng Mei
The following 3 search scenarios: bla:A bla:B bla:A OR bla:B are quite common, so I use 3 filter queries: fq=bla:A fq=bla:B fq=bla:A OR bla:B My question is, since the last fq document set will be built from the first two fq doc sets, will solr still cache the last fq doc set or it just

Re: Does specifying a smaller number of rows in search improve efficiency?

2010-01-20 Thread Lance Norskog
The data in stored fields that is fetched back is in different files than the index data. So, when you ask for documents you are asking for more disk i/o. The different fields are in different places on the disk, so if you request only 1 out of 20 fields, the query will be slightly faster. I once

Re: filter query granularity

2010-01-20 Thread Lance Norskog
The docset for fq=bla:A OR bla:B has no relation to the other two. Different 'fq' filters are made and cached separately. The first time you search with a filter query, Solr does that query and saves the list of documents matching the search. 2010/1/20 Wangsheng Mei hairr...@gmail.com: The
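
The behaviour described in this reply (each distinct `fq` string gets its own cached document set, with no reuse between related filters) can be illustrated with a toy model. This is only a sketch under that assumption: the class `ToyFilterCache`, the sample index, and the predicates are hypothetical and do not reflect Solr's actual cache implementation.

```python
class ToyFilterCache:
    """Toy model: one cached document set per distinct fq string."""

    def __init__(self, index):
        self.index = index  # {doc_id: value of the 'bla' field}
        self.cache = {}     # {fq string: frozenset of matching doc ids}
        self.misses = 0     # how many times a filter had to be computed

    def docset(self, fq, predicate):
        # The cache key is the filter string itself, so "bla:A OR bla:B"
        # is computed and stored independently of "bla:A" and "bla:B".
        if fq not in self.cache:
            self.misses += 1
            self.cache[fq] = frozenset(
                doc for doc, value in self.index.items() if predicate(value)
            )
        return self.cache[fq]


index = {1: "A", 2: "B", 3: "C", 4: "A"}
cache = ToyFilterCache(index)
a = cache.docset("bla:A", lambda v: v == "A")
b = cache.docset("bla:B", lambda v: v == "B")
a_or_b = cache.docset("bla:A OR bla:B", lambda v: v in ("A", "B"))
cache.docset("bla:A", lambda v: v == "A")  # repeat: served from the cache
print(len(cache.cache), cache.misses)
```

In this model the combined filter costs one extra computation and one extra cache entry, even though its result happens to equal the union of the other two.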

Solr Analysis Webinar Jan 28, 2010

2010-01-20 Thread Jay Hill
My colleague at Lucid Imagination, Tom Hill, will be presenting a free webinar focused on analysis in Lucene/Solr. If you're interested, please sign up and join us. Here is the official notice: We'd like to invite you to a free webinar our company is offering next Thursday, 28 January, at 2PM

Re: Rounding dates on sort and filter

2010-01-20 Thread Lance Norskog
The precision of the date should not matter that much in the time for the first sort. Lucene makes a pair of arrays for the sorted field, one with each unique date and one with each document number in the index. (Yes, the entire index.) The first array will be shorter when you cut the date
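
The effect on the first of those two arrays can be sketched with made-up data. This is a rough Python illustration, assuming hypothetical timestamps (one document every 7 seconds over three days) and a hypothetical helper `round_to_day`; it only counts unique values and does not model Lucene's internal structures.

```python
from datetime import datetime, timedelta

start = datetime(2010, 1, 20)
# Hypothetical data: one document every 7 seconds for three days
dates = [start + timedelta(seconds=7 * i) for i in range(3 * 24 * 3600 // 7)]


def round_to_day(ts):
    """Drop the time-of-day part, keeping only the calendar day."""
    return ts.replace(hour=0, minute=0, second=0, microsecond=0)


unique_full = len(set(dates))                       # one entry per distinct timestamp
unique_day = len({round_to_day(d) for d in dates})  # one entry per distinct day
per_document = len(dates)                           # unchanged: one entry per document

print(unique_full, unique_day, per_document)
```

Only the unique-value count shrinks with coarser dates; the per-document array stays the same length, which matches the point that precision should not matter much for the first sort.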

Re: Google Commerce Search

2010-01-20 Thread Lance Norskog
The Linux file systems are generally at least twice as fast as the Windows NTFS file system. Solr installations are mostly disk-limited so this will have a major effect. On Tue, Jan 19, 2010 at 12:53 PM, wojtekpia wojte...@hotmail.com wrote: While Solr is functionally platform independent, I

Replication Handler Severe Error: Unable to move index file

2010-01-20 Thread Trey
Does anyone know what would cause the following error?: 10:45:10 AM org.apache.solr.handler.SnapPuller copyAFile SEVERE: *Unable to move index file* from: /home/solr/cores/core8/index.20100119103919/_6qv.fnm to: /home/solr/cores/core8/index/_6qv.fnm This occurred a few days back and we

RE : Need help : Solr configuration issue for sorting on title field

2010-01-20 Thread EL KASMI Hicham
Sorry Chris and others, it's the first time I'm using a mailing list to ask a question. I'll send my question again in a new blank clean message. Thanks for references. Hicham Original message From: Chris Hostetter [mailto:hossman_luc...@fucit.org] Date: Thu. 21/01/2010 0:12 To:

Re: Replication Handler Severe Error: Unable to move index file

2010-01-20 Thread Otis Gospodnetic
It's hard to tell without poking around, but one of the first things I'd do would be to look for /home/solr/cores/core8/index.20100119103919/_6qv.fnm - does this file/dir really exist? Or, rather, did it exist when the error happened. I'm not looking at the source code now, but is that really

Re: solr blocking on commit

2010-01-20 Thread Steve Conover
How is solr organized so that search can continue when a commit has closed the index? Also, looking at lucene docs, commit causes a system fsync().  Won't search also get blocked by the IO traffic generated? ...I'll run iostat too and see if there's anything interesting to report

Re: filter query granularity

2010-01-20 Thread Wangsheng Mei
Thanks for your explanation, it makes a lot of sense to me. 2010/1/21 Lance Norskog goks...@gmail.com The docset for fq=bla:A OR bla:B has no relation to the other two. Different 'fq' filters are made and cached separately. The first time you search with a filter query, Solr does that query and

Re: Fastest way to use solrj

2010-01-20 Thread Tim Terlegård
Yes, it worked! Thank you very much. But do I need to use curl or can I use CommonsHttpSolrServer or StreamingUpdateSolrServer? If I can't use BinaryWriter then I don't know how to do this. /Tim 2010/1/20 Noble Paul നോബിള്‍ नोब्ळ् noble.p...@corp.aol.com: 2010/1/20 Tim Terlegård

question

2010-01-20 Thread Daniel Angelov
Is it possible to set a maximum number of indexed documents in solr? For example, I want to insert at most 5000 documents into solr; after that solr must refuse further inserts.