Re: Which QueryParser to use

2011-01-20 Thread Ahmet Arslan
We construct our query by Lucene API before, as BooleanQuery, TermQuery those kind of things. OK, it seems that your fields are not analyzed and you don't do any analysis while constructing your query with the Lucene API. Correct? Then you can use your existing Java code directly inside a

Re: [POLL] Where do you get Lucene/Solr from? Maven? ASF Mirrors?

2011-01-20 Thread Jürgen Jakobitsch
Where do you get your Lucene/Solr downloads from? [] ASF Mirrors (linked in our release announcements or via the Lucene website) [X] Maven repository (whether you use Maven, Ant+Ivy, Buildr, etc.) [X] I/we build them from source via an SVN/Git checkout. [] Other (someone in your company

Re: Which QueryParser to use

2011-01-20 Thread kun xiong
Thanks a lot for your reply. That was very helpful. We construct our Lucene query after certain analysis (e.g. word segmentation, category identification). Do you mean we plug those analysis logic and query construction parts into Solr, and Solr takes the very beginning input? Kun 2011/1/20

Re: How to keep a maintained index with crawled data

2011-01-20 Thread Erlend Garåsen
Thanks Jack! I will give it a try, even though I finally have a Nutch configuration that does exactly what I want it to do (except keeping an eye on updated and deleted documents). Erlend On 19.01.11 16.52, Jack Krupansky wrote: Take a look at Apache ManifoldCF (incubating, close to 0.1

Re: Highlighting approach.

2011-01-20 Thread Stefan Matheis
Hasnain, there is no need for any _additional_ looping. Of course, you have to loop over the results initially, but that should be enough. Use result/doc/[@name=id] to check whether lst[@name=highlighting]/lst[@name=$id] exists, and if so .. replace the original content w/ the highlighted
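
The single-pass merge described above can be sketched as follows. This is a minimal illustration, not code from the thread: it assumes the response was requested with wt=json and already parsed into a dict, that the unique key field is called id, and the function name apply_highlighting is hypothetical.

```python
def apply_highlighting(solr_response, id_field="id"):
    """Return copies of the result docs with highlighted fields swapped in.

    Assumes the parsed JSON layout: response.docs holds the documents and
    highlighting maps each document id to {field: [fragments]}.
    """
    highlighting = solr_response.get("highlighting", {})
    merged = []
    for doc in solr_response["response"]["docs"]:
        doc = dict(doc)  # copy, so the original response stays untouched
        snippets = highlighting.get(str(doc[id_field]), {})
        for field, fragments in snippets.items():
            if fragments:  # only replace when Solr returned a fragment
                doc[field] = " ... ".join(fragments)
        merged.append(doc)
    return merged
```

One loop over the documents is enough because the highlighting section is keyed by document id, so each lookup is a plain dictionary access.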

Re: using dismax

2011-01-20 Thread Markus Jelsma
Did I write wt? Oh dear. The q and w are too close =) Markus, it's not wt, it's qt; wt is for the response type. Also, qt is not for the query parser, it's for the request handler. In solrconfig.xml, many request handlers can be defined, using either the dismax query parser or the lucene query parser. If you

Re: Local param tag voodoo ?

2011-01-20 Thread Xavier SCHEPLER
Ok, I tried to use nested queries this way: wt=json&indent=true&fl=qFR&q=sarkozy _query_:{!tag=test}chirac&facet=true&facet.field={!ex=test}studyDescriptionId It resulted in this error: facet_counts:{ facet_queries:{}, exception:java.lang.NullPointerException\n\tat

Multicore Reload Theoretical Question

2011-01-20 Thread Em
Hello list, I got a theoretical question about a Multicore-Situation: I got two cores: active, inactive The active core serves all the queries. The inactive core is the tricky thing: I create an optimized index outside the environment and want to insert that optimized index 1 to 1 into the

Re: Adding metadata to a Solr schema

2011-01-20 Thread Em
Hi David, if your transaction id will be handled at document level, you can just add a field in your schema named transaction_id - that's it. All you have to do is to insert that transaction_id every time you do an update (Solr does not generate a transaction_id by default and I don't know of

Problem with replication

2011-01-20 Thread Thomas Kellerer
Hi all, we have implemented a Solr based search in our web application. We have one master server that maintains the index which is replicated to the slaves using the built-in Solr replication. This has been working fine so far, but suddenly the replication does not send the modified files

Multicore Search Map size must not be negative

2011-01-20 Thread Jörg Agatz
Hello.. I have created a multicore search and want to search in more than one core! Now I have done: http://192.168.105.59:8080/solr/mail/select?wt=phps&q=*:*&shards=192.168.105.59:8080/solr/mail,192.168.105.59:8080/solr/mail11 But I get an error... HTTP Status 500 - Map size must not be negative

Re: Multicore Search Map size must not be negative

2011-01-20 Thread Markus Jelsma
That looks like this issue: https://issues.apache.org/jira/browse/SOLR-2278 On Thursday 20 January 2011 13:02:41 Jörg Agatz wrote: Hello.. I have created a multicore search and want to search in more than one core! Now I have done:

Re: Local param tag voodoo ?

2011-01-20 Thread Xavier SCHEPLER
Since there seems to be no voodoo available, I did it on the client side. I send a first request to get the facets and a second to get the documents and their highlighting. It works well but requires more processing. From: Xavier SCHEPLER

Re: Problem with replication

2011-01-20 Thread Stevo Slavić
On which events did you configure master to perform replication? replicateAfter Regards, Stevo. On Thu, Jan 20, 2011 at 12:53 PM, Thomas Kellerer spam_ea...@gmx.net wrote: Hi all, we have implemented a Solr based search in our web application. We have one master server that maintains the

Re: Mem allocation - SOLR vs OS

2011-01-20 Thread Salman Akram
I will be looking into JConsole. One more question regarding caching. When we talk about warm-up queries does that mean that some of the complex queries (esp those which require high I/O e.g. phrase queries) will really be very slow (on lets say an index of 200GB) if they are not cached? I am

Re: Problem with replication

2011-01-20 Thread Thomas Kellerer
Here is our configuration: <lst name="master"> <str name="enable">true</str> <str name="replicateAfter">commit</str> <str name="replicateAfter">startup</str> <str name="confFiles">stopwords.txt,stopwords_de.txt,stopwords_en.txt,synonyms.txt</str> </lst> Stevo Slavić, 20.01.2011 13:26: On which events did you

Re: Single value vs multi value setting in tokenized field

2011-01-20 Thread kenf_nc
Thanks guys. I read (and actually digested this time) how multivalued fields work and now realize my question came from a 'structured language/dbms' background. The multivalued field is stored basically as a single value with extra spacing between terms (the positionIncrementGap previously

Re: solrconfig.xml settings question

2011-01-20 Thread kenf_nc
Is that it? Of all the strange, esoteric, little understood configuration settings available in solrconfig.xml, the only thing that affects Index Speed vs Query Speed is turning on/off the Query Cache and RamBufferSize? And for the latter, why wouldn't RamBufferSize be the same for both...that

Re: Problem with replication

2011-01-20 Thread Thomas Kellerer
Thomas Kellerer, 20.01.2011 12:53: Hi all, we have implemented a Solr based search in our web application. We have one master server that maintains the index which is replicated to the slaves using the built-in Solr replication. This has been working fine so far, but suddenly the replication

Re: Problem with replication

2011-01-20 Thread Stevo Slavić
So if on startup index gets replicated, then commit probably isn't being called anywhere on master. Is that index configured to autocommit on master, or do you commit from application code? If you commit from application code, check if commit actually gets issued to the slave. Regards, Stevo.

Re: Problem with replication

2011-01-20 Thread Thomas Kellerer
Stevo Slavić, 20.01.2011 15:42: So if on startup index gets replicated, then commit probably isn't being called anywhere on master. No, the index is not replicated on startup (same behaviour: no files to download) Is that index configured to autocommit on master, or do you commit from

Re: Which QueryParser to use

2011-01-20 Thread Jonathan Rochkind
On 1/20/2011 1:42 AM, kun xiong wrote: That example string means our query is a BooleanQuery containing BooleanQuerys. I am wondering how to write a complicated BooleanQuery for dismax, like (A or B or C) and (D or E). Or do I have to use the Lucene query parser? You can't do it with dismax. You might
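
For the Lucene query parser route raised in the question, the nested shape (A or B or C) and (D or E) can be written as a plain query string. The sketch below only assembles that string; the helper names are hypothetical, and it assumes the terms are already escaped and analyzed the way the target field expects.

```python
def or_group(terms):
    """Join terms into one parenthesised OR clause."""
    return "(" + " OR ".join(terms) + ")"


def and_of_or_groups(groups):
    """Require every OR group to match by joining the groups with AND."""
    return " AND ".join(or_group(group) for group in groups)


query = and_of_or_groups([["A", "B", "C"], ["D", "E"]])
print(query)
```

The resulting string would be sent as the q parameter to a request handler that uses the lucene query parser rather than dismax.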

Re: Local param tag voodoo ?

2011-01-20 Thread Yonik Seeley
On Thu, Jan 20, 2011 at 4:59 AM, Xavier SCHEPLER xavier.schep...@sciences-po.fr wrote: Ok, I tried to use nested queries this way: wt=json&indent=true&fl=qFR&q=sarkozy _query_:{!tag=test}chirac&facet=true&facet.field={!ex=test}studyDescriptionId It resulted in this error: facet_counts:{

Search for social networking sites

2011-01-20 Thread sivaprasad
Hi, I am building a social networking site. For searching profiles, I am trying to implement Solr. But here I am facing a problem. As a social networking site, the database is going to get frequent updates/inserts. That means the search is going to be in real time. How can we achieve this

Showing facet values in alphabetical order

2011-01-20 Thread PeterKerk
I want to provide a list of facets to my visitors ordered alphabetically, for example, for the 'features' facet I have: data-config.xml: <entity name="location_feature" query="select featureid from location_features where locationid='${location.id}'"> <entity name="feature" query="select title from

Adding weightage to the facets count

2011-01-20 Thread sivaprasad
Hi, I am building a tag cloud for products by using facets. I made tag names as facets and I am taking the facet counts as reference to display the tag cloud. Each product has tags with their own weightage. Let us say, for example, prod1 has a tag called “Light Weight” with weightage 20, prod2 has a tag called

Re: Which QueryParser to use

2011-01-20 Thread Ahmet Arslan
We construct our lucene query after certain analysis(ex : words segmentation, category identification). By analysis, I am referring to the charfilter(s)+tokenizer+tokenfilter(s) combination. Do you mean we plugin those analysis logic and query construction part onto solr, and solr takes the very

Re: Mem allocation - SOLR vs OS

2011-01-20 Thread Otis Gospodnetic
Salman, Yeah, that first cache is too small. Double it and evictions may go away. Re warm-up queries - yes, they'll be slower than when executed later when they get pulled out of the cache. Plus if these are warm-up queries, the relevant parts of the index may not be in the OS buffer cache.

Re: Showing facet values in alphabetical order

2011-01-20 Thread Ahmet Arslan
But this doesnt give me the facets in an alphabetical order. Besides the features facet, I also have some other facets that ALSO need to be shown in alphabetical order. How to approach this? facet.sort=false http://wiki.apache.org/solr/SimpleFacetParameters#facet.sort

Re: [POLL] Where do you get Lucene/Solr from? Maven? ASF Mirrors?

2011-01-20 Thread Tomás Fernández Löbbe
On Tue, Jan 18, 2011 at 6:04 PM, Grant Ingersoll gsing...@apache.org wrote: As devs of Lucene/Solr, due to the way ASF mirrors, etc. works, we really don't have a good sense of how people get Lucene and Solr for use in their application. Because of this, there has been some talk of dropping

WARNING: re-index all trunk indices

2011-01-20 Thread Michael McCandless
If you are using Lucene's trunk (to be 4.0) builds, read on... I just committed LUCENE-2872, which is a hard break on the index file format. If you are living on Lucene's trunk then you have to remove any previously created indices and re-index, after updating. The change cuts over to a faster

Indexing same data in multiple fields with different filters

2011-01-20 Thread shm
Hi, I have a little problem regarding indexing, that i don't know how to solve, i need to index the same data in different ways into the same field. The problem is a normalization problem, and here is an example: I have a special character \uA732, which i need to normalize in two different ways

Re: Showing facet values in alphabetical order

2011-01-20 Thread Jonathan Rochkind
Are you showing the facets with facet parameters in your request? Then you can ask for the facets to be returned sorted by byte-order with facet.sort=index. Got nothing to do with your schema, let alone your DIH import configuration that you showed us. Just a matter of how you ask Solr for
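
Because the ordering is purely a request-time choice, it can be shown by building the query string alone. This is a sketch under assumptions: the field names features and city are placeholders, and the helper facet_params is hypothetical; only the facet, facet.field and facet.sort parameters come from the thread.

```python
from urllib.parse import urlencode


def facet_params(query, facet_fields, sort="index"):
    """Build a Solr query string that asks for facets in index order."""
    params = [("q", query), ("facet", "true"), ("facet.sort", sort)]
    # facet.field may be repeated, once per facet to return
    params += [("facet.field", field) for field in facet_fields]
    return urlencode(params)


print(facet_params("*:*", ["features", "city"]))
```

Every facet listed in the request is then returned in index order, so nothing in the schema or the DIH configuration has to change.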

Re: Adding weightage to the facets count

2011-01-20 Thread Jonathan Rochkind
Maybe?: Just keep the 'weightages' in an external store of some kind (rdbms, nosql like mongodb, just a straight text config file that your app loads into a hash internally, whatever), rather than Solr, and have your app look them up for each facet value to be displayed, after your app fetches
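
A minimal sketch of that application-side lookup follows. It assumes the facet values arrive in Solr's flat JSON layout (value, count, value, count, ...), that the weights live in an ordinary dict loaded by the app, and that multiplying count by weight is an acceptable way to combine them; that last choice is an assumption made here for illustration, not something stated in the thread.

```python
def weighted_tag_cloud(flat_facets, weights, default_weight=1):
    """Pair each facet value with its count and a weight-adjusted score.

    flat_facets: ["tag a", 3, "tag b", 2, ...] as returned for one facet field.
    weights: externally stored mapping of tag -> weightage.
    """
    values = flat_facets[0::2]
    counts = flat_facets[1::2]
    return [
        (value, count, count * weights.get(value, default_weight))
        for value, count in zip(values, counts)
    ]


cloud = weighted_tag_cloud(["light weight", 3, "cheap", 2], {"light weight": 20})
print(cloud)
```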

Document level security

2011-01-20 Thread Rok Rejc
Hi all, I have an index containing a couple of million documents. Documents are grouped into groups, each group contains from 1000-2 documents. The problem: Each group has defined permission settings. It can be viewed by public, viewed by registered users, or viewed by a list of users (each

Indexing all permutations of words from the input

2011-01-20 Thread Martin Jansen
Hey there, I'm looking for an analyzer configuration for Solr 1.4 that accomplishes the following: Given the input abc xyz foo I would like to add at least the following token combinations to the index: abc abc xyz abc xyz foo abc foo xyz xyz foo
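
The token combinations listed above are the order-preserving subsets of the input words, which can be enumerated outside Solr as a quick check of what the index would need to contain. This is only a client-side sketch, not an analyzer configuration, and the function name is hypothetical.

```python
from itertools import combinations


def ordered_token_combinations(text):
    """List every non-empty combination of the words, keeping input order."""
    tokens = text.split()
    result = []
    for size in range(1, len(tokens) + 1):
        for combo in combinations(tokens, size):
            result.append(" ".join(combo))
    return result


print(ordered_token_combinations("abc xyz foo"))
```

The number of combinations is 2^n - 1 for n words, so this approach grows quickly with longer inputs.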

Re: Indexing all permutations of words from the input

2011-01-20 Thread Jonathan Rochkind
Why do you want to do this, what is it meant to accomplish? There might be a better way to accomplish what it is you are trying to do; I can't think of anything (which doesn't mean it doesn't exist) that what you're actually trying to do would be required in order to do. What sorts of

Opensearch Format Support

2011-01-20 Thread Tod
Does Solr support the Opensearch format? If so could someone point me to the correct documentation? Thanks - Tod

Re: Indexing all permutations of words from the input

2011-01-20 Thread Martin Jansen
On 20.01.11 22:19, Jonathan Rochkind wrote: On 1/20/2011 4:03 PM, Martin Jansen wrote: I'm looking for an analyzer configuration for Solr 1.4 that accomplishes the following: Given the input abc xyz foo I would like to add at least the following token combinations to the index: abc

Re: Indexing all permutations of words from the input

2011-01-20 Thread Jonathan Rochkind
Aha, I have no idea if there actually is a better way of achieving that, auto-completion with Solr is always tricky and I personally have not been happy with any of the designs I've seen suggested for it. But I'm also not entirely sure your design will actually work, but neither am I sure it

Re: Opensearch Format Support

2011-01-20 Thread Jonathan Rochkind
No, not exactly. In general, people don't expose their Solr API direct to the world -- they front Solr with some software that is exposed to the world. (If you do expose your Solr API directly to the world, you will need to think carefully about security, and make sure you aren't letting

RE: Indexing all permutations of words from the input

2011-01-20 Thread Steven A Rowe
Hi Martin, The co-occurrence filter I'm working on at https://issues.apache.org/jira/browse/LUCENE-2749 would do what you want (among other things). Still vaporware at this point, as I've only put a couple of hours into it, so don't hold your breath :) Steve -Original Message-

Re: Opensearch Format Support

2011-01-20 Thread Grant Ingersoll
You might also see https://issues.apache.org/jira/browse/SOLR-2143 On Jan 20, 2011, at 4:50 PM, Jonathan Rochkind wrote: No, not exactly. In general, people don't expose their Solr API direct to the world -- they front Solr with some software that is exposed to the world. (If you do expose

Re: Document level security

2011-01-20 Thread Peter Sturge
Hi, One of the things about Document Security is that it never involves just one thing. There are a lot of things to consider, and unfortunately, they're generally non-trivial. Deciding how to store/hold/retrieve permissions is certainly one of those things, and you're right, you should avoid

Re: Which QueryParser to use

2011-01-20 Thread kun xiong
Okey, thanks very much. 2011/1/21 Ahmet Arslan iori...@yahoo.com We construct our lucene query after certain analysis(ex : words segmentation, category identification). By analysis, I referring charfilter(s)+tokenizer+tokenfilter(s) combination. Do you mean we plugin those analysis

Re: Document level security

2011-01-20 Thread Dennis Gearon
I'm not sure how you COULD do searching without having the permissions in the documents. I mentally use the model of unix filesystems, as a starter. Simple, but powerful. If I needed a separate table for permissions, or index, I'd have to do queries, with GINORMOUS amounts of OR statements. I

Re: Document level security

2011-01-20 Thread Dennis Gearon
I'm thinking of using something like this: http://www.xaprb.com/blog/2006/08/16/how-to-build-role-based-access-control-in-sql/ http://www.xaprb.com/blog/2006/08/18/role-based-access-control-in-sql-part-2/ - Original Message From: Dennis Gearon gear...@sbcglobal.net To:

Re: Indexing same data in multiple fields with different filters

2011-01-20 Thread Gora Mohanty
On Thu, Jan 20, 2011 at 4:08 PM, shm s...@dbc.dk wrote: Hi, I have a little problem regarding indexing, that i don't know how to solve, i need to index the same data in different ways into the same field. The problem is a normalization problem, and here is an example: I have a special

[Call for Papers] ICSE Software Engineering for Cloud Computing (SECLOUD) Workshop

2011-01-20 Thread Mattmann, Chris A (388J)
(apologies for the cross posting) *** PLEASE NOTE - the deadline for submitting papers has been extended by 1 week to 1/28/2011! *** Please consider submitting a paper to the ICSE 2011 Software Engineering for Cloud Computing (SECLOUD) Workshop to be held Sunday, May 22, 2011, at the Hilton

Solr Optimize Times

2011-01-20 Thread jcanabou
Does anyone know if there is a function in Solr that allows us to log optimize times, i.e. the length of time optimization takes? I can find a lot of questions about how long optimization should take, but thus far nothing on how to find out how long a particular run actually took. Thanks!! -- View

Re: DIH with full-import and cleaning still keeps old index

2011-01-20 Thread Bernd Fehling
Looks like this is a bug and I should write a jira issue for it? Regards Bernd Am 20.01.2011 11:30, schrieb Bernd Fehling: Hi list, after sending full-import=true&clean=true&commit=true Solr 4.x (apache-solr-4.0-2010-11-24_09-25-17) responds with: - DataImporter doFullImport -

Indexing FTP Documents through SOLR??

2011-01-20 Thread pankaj bhatt
Hi All, Is there any way in Solr, or any plug-in, through which the folders and documents in an FTP location can be indexed? / Pankaj Bhatt.

Re: Document level security

2011-01-20 Thread Grijesh
Hi Rok, I have used about 25 ids with the OR operator and it's working fine for me. You just have to increase the maxBooleanClauses parameter and also configure the max header size on the servlet container to allow big query requests. - Thanx: Grijesh -- View this message in context:
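
The OR-list approach can be sketched as a small helper that builds the filter and refuses to exceed the clause limit. The field name group_id and the function name are hypothetical; the limit of 1024 is assumed here because it is the value shipped in Solr's example solrconfig.xml, and it should be set to whatever maxBooleanClauses is configured as.

```python
def ids_filter_query(field, ids, max_clauses=1024):
    """Build field:(id1 OR id2 ...) and fail early if the list is too long."""
    ids = list(ids)
    if len(ids) > max_clauses:
        raise ValueError(
            "%d ids exceed the configured clause limit of %d" % (len(ids), max_clauses)
        )
    return field + ":(" + " OR ".join(str(i) for i in ids) + ")"


print(ids_filter_query("group_id", [1, 2, 3]))
```

Failing on the client keeps an oversized permission list from turning into a server-side error for the user.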

Integrating Surround Query Parser

2011-01-20 Thread Ahson Iqbal
Hi All, I want to integrate the Surround Query Parser with Solr. To do this I downloaded a jar file from the internet, pasted that jar file into WEB-INF/lib and configured the query parser in solrconfig.xml as <queryParser name="SurroundQParser"

Re: Problem with replication

2011-01-20 Thread Thomas Kellerer
We have tried that as well, but the slave still claims to have a higher index version, even when the index files were deleted completely Regards Thomas Stevo Slavić, 20.01.2011 16:52: Not too elegant but valid check would be to bring slave down, delete it's index data directory, then to commit

Re: Indexing FTP Documents through SOLR??

2011-01-20 Thread Gora Mohanty
On Fri, Jan 21, 2011 at 12:21 PM, pankaj bhatt panbh...@gmail.com wrote: Hi All, Is there any way in Solr, or any plug-in, through which the folders and documents in an FTP location can be indexed? [...] What format are these documents in? Which parts of the documents do you want to index? In

Re: pruning search result with search score gradient

2011-01-20 Thread Toke Eskildsen
On Tue, 2011-01-11 at 12:12 +0100, Julien Piquot wrote: I would like to be able to prune my search result by removing the less relevant documents. I'm thinking about using the search score : I use the search scores of the document set (I assume they are sorted in descending order),
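
Since the rest of the proposal is cut off here, the following is only one possible reading of score-based pruning, not the poster's method: walk the descending scores and stop at the first large relative drop. The threshold value and the function name are assumptions.

```python
def prune_by_score_drop(scores, max_relative_drop=0.5):
    """Return how many leading results to keep.

    scores must be sorted in descending order. Results are kept until the
    score falls by more than max_relative_drop relative to the previous one.
    """
    if not scores:
        return 0
    keep = 1  # the top result is always kept
    for previous, current in zip(scores, scores[1:]):
        if previous > 0 and (previous - current) / previous > max_relative_drop:
            break
        keep += 1
    return keep


print(prune_by_score_drop([10.0, 9.0, 8.5, 3.0, 2.9]))
```

A relative drop is used instead of an absolute one because raw Lucene scores are not comparable across queries.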

Re: Document level security

2011-01-20 Thread Dennis Gearon
Would you do that with 1000's of users? How expensive in processor time is it? Have you ever benchmarked it? Dennis Gearon Signature Warning It is always a good idea to learn from your own mistakes. It is usually a better idea to learn from others’ mistakes, so you do not

Re: pruning search result with search score gradient

2011-01-20 Thread Dennis Gearon
That's a pretty good idea, using 'delta score'. Dennis Gearon

Re: Search for social networking sites

2011-01-20 Thread Espen Amble Kolstad
I haven't tried myself, but you could look at solandra : https://github.com/tjake/Lucandra - Espen On Thu, Jan 20, 2011 at 6:30 PM, stockii stock.jo...@googlemail.com wrote: http://wiki.apache.org/solr/NearRealtimeSearchTuning

Re: Integrating Surround Query Parser

2011-01-20 Thread Dennis Gearon
Sounds to me like you either have to find a way NOT to use a parser that is a child class of: org.apache.solr.search.QParserPlugin (not sure if that's possible), or you have to find out what's wrong with the file. Where did you get it, have you talked to the author? Dennis Gearon