SOLR deduplication

2011-01-26 Thread Jason Brown
Hi - I have the SOLR deduplication configured and working well. Is there any way I can tell which documents have not been added to the index as a result of the deduplication rejecting subsequent identical documents? Many Thanks, Jason Brown.

Re: SOLR deduplication

2011-01-26 Thread Markus Jelsma
Not right now: https://issues.apache.org/jira/browse/SOLR-1909
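
Until that issue is resolved, one possible workaround is to track duplicates on the client before the documents are sent. The following is only a minimal sketch: the helper names and the field list are hypothetical, and the MD5 hash here is a stand-in that will not necessarily match the signature Solr itself computes.

```python
import hashlib

def signature(doc, fields):
    """Hash the chosen field values; a stand-in for Solr's signature, not a copy of it."""
    h = hashlib.md5()
    for name in fields:
        h.update(str(doc.get(name, "")).encode("utf-8"))
        h.update(b"\x00")  # separator so ("ab", "c") differs from ("a", "bc")
    return h.hexdigest()

def split_duplicates(docs, fields):
    """Return (unique_docs, rejected_docs), keeping the first document per signature."""
    seen = set()
    unique, rejected = [], []
    for doc in docs:
        sig = signature(doc, fields)
        if sig in seen:
            rejected.append(doc)
        else:
            seen.add(sig)
            unique.append(doc)
    return unique, rejected
```

The rejected list can then be logged before the unique documents are posted to Solr.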

Re: Use terracotta bigmemory for solr-caches

2011-01-26 Thread Martin Grotzke
On Tue, Jan 25, 2011 at 4:19 PM, Em mailformailingli...@yahoo.de wrote: Hi Martin, are you sure that your GC is well tuned? These are the heap-related JVM configurations for the servers running with 17GB heap size (one with parallel collector, one with CMS): -XX:+HeapDumpOnOutOfMemoryError

Re: Weird behaviour with phrase queries

2011-01-26 Thread Jerome Renard
Hi Erick, On Tue, Jan 25, 2011 at 1:38 PM, Erick Erickson erickerick...@gmail.com wrote: Frankly, this puzzles me. It *looks* like it should be OK. One warning, the analysis page sometimes is a bit misleading, so beware of that. But the output of your queries makes it look like the query is

Display analyzed values in hitlist

2011-01-26 Thread Martin Rödig
Hi, I want to display author names in the hitlist. For the author metafield I created a new fieldtype type_author which includes a synonym list. In the synonym list all possible names of a person are reduced to one name. Example: d...@hh.de, dietmar, brock, dietmar brock, db =

Re: please help Problem with dataImportHandler

2011-01-26 Thread Ezequiel Calderara
And the answer there didn't help? Why not copy the logs of this new error too? Every time you encounter an error, take time to send the log output, and if it's needed the schema.xml or the solrconfig.xml. Thanks. On Tue, Jan 25, 2011 at 6:44 AM, Dinesh mdineshkuma...@karunya.edu.in wrote:

Re: Highlighting with/without Term Vectors

2011-01-26 Thread Grant Ingersoll
On Jan 24, 2011, at 2:42 PM, Salman Akram wrote: Hi, Does anyone have any benchmarks on how much highlighting speeds up with Term Vectors (compared to without them)? e.g. if highlighting on 20 documents takes 1 sec with Term Vectors, any idea how long it will take without them? I need to know

Re: How to Configure Solr to pick my lucene custom filter

2011-01-26 Thread Valiveti
I am new to using Solr and Lucene. I wrote a custom filter. The logic is built on a multi-valued field of the document found. Only the documents that the user has read access to should be returned. I would like this custom filter to be used during search to filter out the documents.

Re: How to edit / compile the SOLR source code

2011-01-26 Thread Anurag
Actually I also want to edit the source files of Solr. Does that mean I will have to go into the src directory of Solr and then rebuild using ant? Do I need to compile them myself, or will Ant do the whole compiling as well as updating the jar files? I have the following files in the Solr-1.3.0 directory

SolrDocumentList Size vs NumFound

2011-01-26 Thread Bing Li
Dear all, I have a weird problem. The number of matching documents is much more than 10. However, the size of SolrDocumentList is 10, while getNumFound() is the exact count of results. When I iterate the results as follows, only 10 are displayed. How do I get the rest?

Re: SolrDocumentList Size vs NumFound

2011-01-26 Thread Markus Jelsma
Hi, If your query yields 1000 documents and the rows parameter is 10 then you'll get only 10 documents. Consult the wiki on the start and rows parameters: http://wiki.apache.org/solr/CommonQueryParameters Cheers.
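
To make the start/rows relationship concrete, here is a minimal sketch of walking through a full result set page by page. The function name is hypothetical, and num_found stands for the value getNumFound() reports; the sketch only builds the query strings and does not contact a server.

```python
from urllib.parse import urlencode

def page_params(query, num_found, rows=10):
    """Yield one query string per page until num_found results are covered."""
    for start in range(0, num_found, rows):
        yield urlencode({"q": query, "start": start, "rows": rows})

for params in page_params("solr", 25):
    print(params)
```

Each query string asks for the next window of rows documents, so 25 hits with rows=10 need three requests.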

Re: How to Configure Solr to pick my lucene custom filter

2011-01-26 Thread Erick Erickson
Ah, ok. We were talking about different things. "Filters" is kind of an overloaded term in Solr/Lucene, so it's easy to be confused. No, you do not have to deal with analyzers or tokenfilters in your scenario. But let's back up a bit here. How are permissions for documents stored? Because if there's an

Re: How to edit / compile the SOLR source code

2011-01-26 Thread Erick Erickson
Sure, at the top level (above src) you should be able to just type 'ant dist', then look in the dist directory and there should be a solrversion.war. Best, Erick. On Wed, Jan 26, 2011 at 11:43 AM, Anurag anurag.it.jo...@gmail.com wrote: Actually i also want to edit Source Files of Solr. Does that

Re: How to edit / compile the SOLR source code

2011-01-26 Thread Jonathan Rochkind
[Btw, this is great, thank you so much to Solr devs for providing simple ant-based compilation, and not making me install specific development tools and/or figure out how to use maven to compile, like certain other java projects. Just make sure ant is installed and 'ant dist', I can do that!

Re: How to Configure Solr to pick my lucene custom filter

2011-01-26 Thread Valiveti
We thought of using fq. But that seems not to suit our scenario. Both deny and grant access permissions are stored on the document as rules. The order of the rules also needs to be considered. We might have a huge list of values for the ACL field. Each value is considered to be a rule.

Re: How to edit / compile the SOLR source code

2011-01-26 Thread Erick Erickson
Jonathan: If you're working off trunk (and 3x), btw, there's a *great* addition, especially if you use IntelliJ (I haven't personally worked with Eclipse, but there's a target for that too). Just get the source. Go to the top level (e.g. apache-trunk). Execute 'ant idea'. Open IntelliJ and point it

How to group result when search on multiple fields

2011-01-26 Thread cyang2010
Let me give an example to illustrate my question: On the Netflix site, the search box allows you to search by movies, TV shows, actors, directors, and genres. If Tomcat is searched, it gives results such as: movie titles with Tomcat or whatever, and somewhere in between, it also shows two actors, Tom

Re: How to group result when search on multiple fields

2011-01-26 Thread Dennis Gearon
This is probably either 'shingling' or 'facets'. Someone more experienced can verify that or add more details. Dennis Gearon

Re: How to group result when search on multiple fields

2011-01-26 Thread cyang2010
Since the search applies to all fields, and the only results that require grouping are people (actors/directors), I am guessing this: 1. The search still queries a single index. 2. There are two underlying searches. One for matching movie/TV names and genre names. The other one for top two

Re: How to group result when search on multiple fields

2011-01-26 Thread Markus Jelsma
http://wiki.apache.org/solr/ClusteringComponent http://wiki.apache.org/solr/FieldCollapsing

Re: How to group result when search on multiple fields

2011-01-26 Thread cyang2010
From a quick look, field collapsing seems to be what I want. I am still not sure what the ClusteringComponent is. I will look into it more. Is Field Collapsing a new feature for Solr 4.0 (not yet released)? If so, I will have to wait for it. Thanks for pointing it out!

Re: Delta Import occasionally missing records.

2011-01-26 Thread Lance Norskog
The SolrEntityProcessor would be a top-level entity. You would do a query like this: sort=timestamp desc&rows=1&fl=timestamp. This gives you one data item: the timestamp of the last item added to the index. With this, the JDBC sub-entity would create a query that chooses all rows with a timestamp =
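
The "latest timestamp" lookup described above can be sketched outside the DataImportHandler as well. This is only an illustration: the match-all q value, the wt=json parameter, the function names, and the sample response are assumptions, and the field name timestamp is taken from the message.

```python
from urllib.parse import urlencode

def latest_timestamp_params():
    """Query string asking for the single newest document, timestamp field only."""
    return urlencode({
        "q": "*:*",                # assumed: match every document
        "sort": "timestamp desc",  # newest first
        "rows": 1,
        "fl": "timestamp",
        "wt": "json",
    })

def extract_timestamp(response):
    """Pull the timestamp out of a parsed JSON response, or None if the index is empty."""
    docs = response["response"]["docs"]
    return docs[0]["timestamp"] if docs else None

# Hypothetical response body, shaped like Solr's JSON output.
sample = {"response": {"numFound": 3, "start": 0,
                       "docs": [{"timestamp": "2011-01-26T10:00:00Z"}]}}
print(latest_timestamp_params())
print(extract_timestamp(sample))
```

The extracted value is what the sub-entity would then use as the lower bound for its own query.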

Does solr supports indexing of files other than UTF-8

2011-01-26 Thread prasad deshpande
Hello, I am able to successfully index/search non-English data (like Hebrew, Japanese) which was encoded in UTF-8. However, when I tried to index data which was encoded in a local encoding like Big5 for Japanese, I could not see the desired results. The contents after indexing looked garbled for Big5
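
One common way to avoid this kind of garbling is to transcode the source bytes to UTF-8 before posting them, on the assumption that the garbling comes from the bytes being read with the wrong charset. A minimal sketch follows; the function name is hypothetical and the source encoding must be known in advance. Note that Big5 is normally used for Traditional Chinese, while Japanese text is more often in Shift_JIS or EUC-JP, so it is worth confirming which encoding the files really use.

```python
def to_utf8(raw, source_encoding):
    """Decode bytes from a known legacy encoding and re-encode them as UTF-8."""
    return raw.decode(source_encoding).encode("utf-8")

# Round trip with a short Big5 sample.
text = "中文"
big5_bytes = text.encode("big5")
utf8_bytes = to_utf8(big5_bytes, "big5")
print(utf8_bytes.decode("utf-8") == text)
```

Passing the wrong source_encoding either raises UnicodeDecodeError or silently yields wrong characters, which matches the garbled output described.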

configure httpclient to access solr with user credential on third party host

2011-01-26 Thread Darniz
Hello, I uploaded the solr.war file to my hosting provider and added a security constraint in the web.xml file of my Solr war so that only a specific user with a certain role can issue GET and POST requests. When I open a browser and type www.maydomainname.com/solr I get a dialog box to enter userid and
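
Assuming the security constraint uses HTTP Basic authentication (the browser dialog suggests it, but the message does not say), a client has to send an Authorization header with each request. The sketch below shows the idea in Python rather than with the HttpClient API; the URL, user name, password and function name are placeholders.

```python
import base64
import urllib.request

def build_request(url, user, password):
    """Create a request carrying an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    req = urllib.request.Request(url)
    req.add_header("Authorization", "Basic " + token)
    return req

# Placeholder values; nothing is sent until the request is opened with urlopen().
req = build_request("http://example.com/solr/select?q=solr", "user", "pass")
print(req.get_header("Authorization"))
```

Basic authentication only base64-encodes the credentials, so it should be combined with HTTPS on a third-party host.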

A Maven archetype that helps packaging Solr as a standalone application embedded in Apache Tomcat

2011-01-26 Thread Simone Tripodi
Hi all, this is just a short mail to make the Maven/Solr communities aware that we published an Apache Maven archetype[1] (that we lazily called 'solr-packager' :P) that helps Apache Solr developers create complete standalone Solr-based applications, embedded in Apache Tomcat, with few