Re: Drop documents when indexing with DHI

2011-03-07 Thread Stefan Matheis
Rosa, try http://wiki.apache.org/solr/DataImportHandler#Special_Commands HTH Stefan On Fri, Mar 4, 2011 at 9:44 PM, Rosa (Anuncios) rosaemailanunc...@gmail.com wrote: Hi, Is it possible to skip documents when indexing with DIH, based on a regex, to filter out certain bad words for example? Thanks
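
The Special Commands section linked above documents `$skipDoc`: a transformer that sets it to true on a row causes DIH to skip that document. DIH transformers themselves are written in Java or as a script inside the DIH config, so the following Python is only a sketch of the filtering logic, not DIH code. The word list, the field name `text`, and the helper `mark_skipped` are hypothetical.

```python
import re

# Hypothetical bad-word list; replace with the real terms to filter on.
BAD_WORDS = re.compile(r"\b(?:badword1|badword2)\b", re.IGNORECASE)


def mark_skipped(row, field="text"):
    """Flag a row with DIH's $skipDoc special command if it matches the regex.

    `row` is a plain dict standing in for the row a DIH transformer receives;
    the field name "text" is an assumption for illustration.
    """
    if BAD_WORDS.search(row.get(field, "")):
        row["$skipDoc"] = "true"
    return row


flagged = mark_skipped({"id": "1", "text": "this has BadWord1 inside"})
clean = mark_skipped({"id": "2", "text": "nothing to see here"})
```

In a real import the same check would live in a custom transformer or a script transformer in the DIH configuration, with the regex applied to whichever column holds the text.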

Re: New PHP API for Solr (Logic Solr API)

2011-03-07 Thread Stefan Matheis
Burak, what's wrong with the existing PHP-Extension (http://php.net/manual/en/book.solr.php)? Regards Stefan On Sun, Mar 6, 2011 at 11:31 PM, Burak burak...@gmail.com wrote: Hello, I have recently finished writing a PHP API for Solr and have released it under the Apache License. The project

Re: New PHP API for Solr (Logic Solr API)

2011-03-07 Thread Lukas Kahwe Smith
On 07.03.2011, at 09:43, Stefan Matheis wrote: Burak, what's wrong with the existing PHP-Extension (http://php.net/manual/en/book.solr.php)? the main issue I see with it is that the API isn't designed much. aka it just exposes lots of features with dedicated methods, but doesn't focus on

Re: Solr Autosuggest help

2011-03-07 Thread Ahmet Arslan
I have added the following line in both the  section and in   section in schema.xml. <filter class="solr.ShingleFilterFactory" maxShingleSize="2" outputUnigrams="true" outputUnigramIfNoNgram="true"/> And reindexed my content. However, if I query solr for the multi-word search terms suggestion, it

StreamingUpdateSolrServer

2011-03-07 Thread Isan Fulia
Hi all, I am using StreamingUpdateSolrServer with queuesize = 5 and threadcount=4. The number of connections created is the same as threadcount. Does it create a new connection for every thread? -- Thanks Regards, Isan Fulia.

Re: Solr Autosuggest help

2011-03-07 Thread rahul
hi.. thanks for your replies.. It seems I mistakenly put ShingleFilterFactory in another field. When I put the factory in correct field it works fine now. Thanks. -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-Autosuggest-help-tp2580944p2645780.html Sent from the

Re: Multiple Blocked threads on UnInvertedField.getUnInvertedField() SegmentReader$CoreReaders.getTermsReader

2011-03-07 Thread Rachita Choudhary
Hi Yonik, Thanks for the information, but we are still facing issues related to slowness and high memory usage. As per my understanding, the default 'FC' method suits our use case, as we have a total of about 1.1 million documents and the no. of unique values for facet fields is quite high. We facet on 5

Re: Multiple Blocked threads on UnInvertedField.getUnInvertedField() SegmentReader$CoreReaders.getTermsReader

2011-03-07 Thread Yonik Seeley
On Mon, Mar 7, 2011 at 9:44 AM, Rachita Choudhary rachita.choudh...@burrp.com wrote: As enum method , will create a bitset for all the unique values It's more complex than that. - small sets will use a sorted int set... not a bitset - you can control what gets cached via facet.enum.cache.minDf
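
As a rough illustration of the parameter Yonik mentions, the sketch below builds the query string for a facet request that uses `facet.method=enum` together with `facet.enum.cache.minDf`. The field name `category` and the threshold `25` are placeholders chosen for the example, not values from this thread.

```python
from urllib.parse import urlencode


def facet_query_string(field, method="enum", min_df=None):
    """Build the query string for a facet-only Solr request.

    `min_df` maps to facet.enum.cache.minDf: terms whose document frequency
    is below it are not put in the filter cache when the enum method is used.
    """
    params = [
        ("q", "*:*"),
        ("rows", "0"),          # facet counts only, no documents
        ("facet", "true"),
        ("facet.field", field),
        ("facet.method", method),
    ]
    if min_df is not None:
        params.append(("facet.enum.cache.minDf", str(min_df)))
    return urlencode(params)


# "category" and 25 are hypothetical example values.
qs = facet_query_string("category", method="enum", min_df=25)
print(qs)
```

Whether raising the threshold helps depends on the term distribution of the faceted field, so it is something to measure rather than assume.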

Re: dismax, and too much qf?

2011-03-07 Thread Jonathan Rochkind
I use about that many qf's in Solr 1.4.1. It works. I'm not entirely sure if it has performance implications -- I do have searching that is somewhat slower than I'd like, but I'm not sure if the lengthy qf is a contributing factor, or other things I'm doing (like a dozen different

Re: New PHP API for Solr (Logic Solr API)

2011-03-07 Thread dan whelan
When are you going to complete the Texis Search API? On 3/6/11 2:31 PM, Burak wrote: Hello, I have recently finished writing a PHP API for Solr and have released it under the Apache License. The project is called Logic Solr API and is located at

Re: dismax, and too much qf?

2011-03-07 Thread Jeff Schmidt
Hi Jonathan: On Mar 7, 2011, at 8:33 AM, Jonathan Rochkind wrote: I use about that many qf's in Solr 1.4.1. It works. I'm not entirely sure if it has performance implications -- I do have searching that is somewhat slower than I'd like, but I'm not sure if the lengthy qf is a contributing

Re: Trying to use FieldReaderDataSource in DIH

2011-03-07 Thread Jeff Schmidt
I can see that XPathEntityProcessor.init() is using the no-arg version of Context.getDataSource(). Since fields are hierarchical, should that not be a request for the current innermost data source (i.e. fieldSource which is a FieldReaderDataSource)? Or should init() be looking at the

Solr Cell DataImport Tika handler broken - fails to index Zip file contents

2011-03-07 Thread Jayendra Patil
Working with the latest Solr Trunk code, it seems the Tika handlers for Solr Cell (ExtractingDocumentLoader.java) and Data Import handler (TikaEntityProcessor.java) fail to index the zip file contents again. They just index the file names again. This issue was addressed some time back, late last

Looking for a Lucene/Solr Contractor

2011-03-07 Thread Drew Kutcharian
Hi Everyone, We are looking for someone to help us build a similarity engine. Here are some preliminary specs for the project. 1) We want to be able to show similar posts when a user posts a new block of text. A good example of this is StackOverflow. When a user tries to ask a new question,

Re: Looking for a Lucene/Solr Contractor

2011-03-07 Thread Jan Høydahl
Please check http://wiki.apache.org/solr/Support and http://wiki.apache.org/lucene-java/Support for a list of companies you may contact. -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com On 7. mars 2011, at 19.40, Drew Kutcharian wrote: Hi Everyone, We are looking

How to handle searches across traditional and simplifies Chinese?

2011-03-07 Thread Andy
I have documents that contain both simplified and traditional Chinese characters. Is there any way to search across them? For example, if someone searches for 类 (simplified Chinese), I'd like to be able to recognize that the equivalent character is 類 in traditional Chinese and search for 类 or 類

Re: How to handle searches across traditional and simplifies Chinese?

2011-03-07 Thread François Schiettecatte
I did a little research into this for a client a while ago. The character mapping is not one to one, which complicates things (TC and SC have evolved independently), and if you want to do a perfect job you will need a dictionary. However there are tables out there (I can dig one up for you) that
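
To make the "not one to one" point concrete, here is a minimal sketch of table-driven query expansion. The two-entry table is hand-written for illustration only (类 has one traditional form, 发 has two, 發 and 髮); a real table would come from a resource such as those discussed in this thread, and `expand_term` and `or_query` are hypothetical helper names.

```python
from itertools import product

# Tiny illustrative table: simplified character -> traditional equivalents.
# A real mapping would be far larger and loaded from a conversion table.
SC_TO_TC = {
    "类": ["類"],
    "发": ["發", "髮"],
}


def expand_term(term):
    """Return every variant of `term` obtained by swapping mapped characters."""
    options = [[ch] + SC_TO_TC.get(ch, []) for ch in term]
    return ["".join(combo) for combo in product(*options)]


def or_query(term):
    """Join all variants of `term` into a single OR query clause."""
    return " OR ".join(expand_term(term))


print(or_query("类"))
print(expand_term("发"))
```

Expanding at query time like this grows quickly for longer terms, which is one reason normalizing both the index and the query to a single script in the analysis chain is often preferred.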

Re: How to handle searches across traditional and simplifies Chinese?

2011-03-07 Thread Andy
Thanks. Please tell me more about the tables/software that does the conversion. Really appreciate your help. --- On Mon, 3/7/11, François Schiettecatte fschietteca...@gmail.com wrote: From: François Schiettecatte fschietteca...@gmail.com Subject: Re: How to handle searches across traditional

Re: How to handle searches across traditional and simplifies Chinese?

2011-03-07 Thread François Schiettecatte
Here are a bunch of resources which will help: This does TC = SC conversions: http://search.cpan.org/~audreyt/Encode-HanConvert-0.35/lib/Encode/HanConvert.pm This has a TC = SC converter in there somewhere: http://www.mediawiki.org/wiki/MediaWiki This explains some of the

Re: How to handle searches across traditional and simplifies Chinese?

2011-03-07 Thread Robert Muir
On Mon, Mar 7, 2011 at 7:01 PM, Andy angelf...@yahoo.com wrote: Thanks. Please tell me more about the tables/software that does the conversion. Really appreciate your help. also you might be interested in this example: <filter class="solr.ICUTransformFilterFactory" id="Traditional-Simplified"/>

logical relation among filter queries

2011-03-07 Thread cyang2010
I wonder what the logical relation among filter queries is. I can't find much documentation on filter queries. For example, I want to find all titles that are either PG-13 or R through a filter query. The following query won't give me any result back. So I suppose by default it is intersection

Re: logical relation among filter queries

2011-03-07 Thread Jayendra Patil
you can use the boolean operators in the filter query. e.g. fq=rating:(PG-13 OR R) Regards, Jayendra On Mon, Mar 7, 2011 at 9:25 PM, cyang2010 ysxsu...@hotmail.com wrote: I wonder what is the logical relation among filter queries.  I can't find much documentation on filter query. for
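
A small sketch of the difference, assuming a single-valued field named `rating` as in the reply above: separate `fq` parameters are intersected, so a union has to be written inside one `fq`. The helper `build_select_query` is hypothetical and only builds the query string.

```python
from urllib.parse import urlencode


def build_select_query(q, filter_queries):
    """Build the query string for a Solr select request with repeated fq params."""
    params = [("q", q)]
    params.extend(("fq", fq) for fq in filter_queries)
    return urlencode(params)


# One fq with a boolean OR: matches titles rated PG-13 or R.
union = build_select_query("*:*", ["rating:(PG-13 OR R)"])

# Two separate fq params: Solr intersects them, so a title would need
# both ratings at once to match.
intersection = build_select_query("*:*", ["rating:PG-13", "rating:R"])

print(union)
print(intersection)
```

Each distinct `fq` is also cached separately in the filter cache, so splitting filters is still useful when the conditions really are meant to be ANDed and reused independently.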

Re: New PHP API for Solr (Logic Solr API)

2011-03-07 Thread Burak
On 03/07/2011 12:43 AM, Stefan Matheis wrote: Burak, what's wrong with the existing PHP-Extension (http://php.net/manual/en/book.solr.php)? I think wrong is not the appropriate word here. But if I had to summarize why I wrote this API: * Not everybody is enthusiastic about adding another

Use of multiple tomcat instance and shards.

2011-03-07 Thread rajini maski
I have only 2 GB of RAM available for increasing the Java heap, so my default memory configuration is --JvmMs 128 --JvmMx 512. I have a single Solr data index of up to 6 GB. Now if I fire searches very often against this index, after some time I get an error as java heap