Faster Solr Indexing

2012-03-10 Thread Peyman Faratin
Hi I am trying to index 12MM docs faster than is currently happening in Solr (using solrj). We have identified solr's add method as the bottleneck (and not commit - which is tuned ok through mergeFactor and maxRamBufferSize and jvm ram). Adding 1000 docs is taking approximately 25 seconds.
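A common first thing to try when per-document adds are slow is sending documents in batches instead of one at a time. The sketch below shows only the batching logic in plain Java. The class name `BatchSketch`, the method `partition`, and the batch size of 1000 are hypothetical, and the send step is left as a comment because the original post does not show its indexing code. With SolrJ, that step would presumably be the `add` overload that accepts a collection of documents.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits a list of documents into fixed-size batches.
final class BatchSketch {
    static <T> List<List<T>> partition(List<T> docs, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < docs.size(); start += batchSize) {
            int end = Math.min(start + batchSize, docs.size());
            // Copy the slice so each batch is independent of the source list.
            batches.add(new ArrayList<>(docs.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> docs = new ArrayList<>();
        for (int i = 0; i < 2500; i++) {
            docs.add("doc-" + i);
        }
        for (List<String> batch : partition(docs, 1000)) {
            // In a real indexer, one add request per batch would be sent here.
            System.out.println("would send batch of " + batch.size());
        }
    }
}
```

Whether batching helps in this case depends on where the time is actually going, which is why profiling first, as suggested in the reply, is still the safer starting point.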

Re: Faster Solr Indexing

2012-03-19 Thread Peyman Faratin
://wiki.apache.org/lucene-java/ImproveIndexingSpeed But throw a profiler at the indexer as a first step, just to see where the problem is, CPU or I/O. Best Erick On Sat, Mar 10, 2012 at 4:09 PM, Peyman Faratin pey...@robustlinks.com wrote: Hi I am trying to index 12MM docs faster than is currently

QueryHandler

2012-03-26 Thread Peyman Faratin
Hi A newbie question. I am unsure of the best way to design for my requirement, which is the following. I want to allow another client in solrj to query solr with a query that is handled with a custom handler localhost:9090/solr/tokenSearch?tokens{!dismax

query score across ALL docs

2012-03-28 Thread Peyman Faratin
Hi What is the best way to retrieve the score of a query across ALL documents in the index? i.e. given: 1) docs, [A,B,C,D,E,...M] of M dimensions 2) Query q searcher outputs (efficiently) 1) the score of q across _all_ M dimensional documents, ordered by index number. i.e score(q) =
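To make the desired output shape concrete, here is a small plain-Java sketch that returns one score per document, ordered by document index. It does not use Lucene or Solr. The scoring function (a plain dot product between the query vector and each document vector) and the names `DenseScoreSketch` and `scoreAll` are assumptions for illustration only, since the formula in the post is cut off.

```java
// Hypothetical illustration: one score per document, in index order.
final class DenseScoreSketch {
    // Assumed scoring function: dot product of the query and each document vector.
    static double[] scoreAll(double[][] docs, double[] query) {
        double[] scores = new double[docs.length];
        for (int i = 0; i < docs.length; i++) {
            if (docs[i].length != query.length) {
                throw new IllegalArgumentException("dimension mismatch at doc " + i);
            }
            double sum = 0.0;
            for (int j = 0; j < query.length; j++) {
                sum += docs[i][j] * query[j];
            }
            // Documents with no overlap with the query simply score 0.
            scores[i] = sum;
        }
        return scores;
    }

    public static void main(String[] args) {
        double[][] docs = { {1, 0}, {0, 1}, {1, 1} };
        double[] scores = scoreAll(docs, new double[] {2, 3});
        for (int i = 0; i < scores.length; i++) {
            System.out.println("doc " + i + " -> " + scores[i]);
        }
    }
}
```

The point of the sketch is the dense, index-ordered result array; a real searcher normally returns only matching documents, so non-matching documents would have to be filled in with a zero score.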

custom field default qf of requestHandler

2012-04-03 Thread Peyman Faratin
Hi I have a problem with the following context. I have a field with a custom type of shingledcontent, defined as follows in the schema.xml <field name="shingledContent" type="shingledcontent" compressed="true"

Kernel methods in SOLR

2012-04-23 Thread Peyman Faratin
Hi Has there been any work that tries to integrate Kernel methods [1] with SOLR? I am interested in using kernel methods to solve synonym, hyponym and polysemous (disambiguation) problems which SOLR's Vector space model (bag of words) does not capture. For example, imagine we have only 3

faceting and clustering on MLT via stream.body

2013-02-22 Thread Peyman Faratin
Hi I would like to run an MLT search (in Solrj) of a short piece of text delivered via the stream.body. This part works. What I would like to be able to do is 2 things: - faceting on some number (not ALL) of the results - cluster (using carrot2) all of the results Is this possible? I believe

KeywordTokenizerFactory with SynonymFilterFactory

2012-06-16 Thread Peyman Faratin
Hi I have the following 2 field types <fieldType name="tokenizer1" class="solr.TextField" sortMissingLast="true" autoGeneratePhraseQueries="true"> <analyzer type="index"> <tokenizer class="solr.WhitespaceTokenizerFactory"/> <filter class="solr.LowerCaseFilterFactory"/> </analyzer>

Re: KeywordTokenizerFactory with SynonymFilterFactory

2012-06-16 Thread Peyman Faratin
thank you Michael. On Jun 16, 2012, at 6:40 PM, Michael Ryan wrote: Try changing the tokenizer2 SynonymFilterFactory filter to this: <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="false" expand="true" tokenizerFactory="solr.KeywordTokenizerFactory"/> By default, it

index writer in searchComponent

2012-06-28 Thread Peyman Faratin
Hi Is it possible to add a new document to the index in a custom SearchComponent (that also implements SolrCoreAware)? I can get a reference to the indexReader via the ResponseBuilder parameter of the process() method using rb.req.getSearcher().getReader() But is it possible to actually add

Re: index writer in searchComponent

2012-06-30 Thread Peyman Faratin
idea, are there other processes that will be writing to the index at the same time? What's the purpose here anyway? There might be a better approach Best Erick On Thu, Jun 28, 2012 at 4:02 PM, Peyman Faratin pey...@robustlinks.com wrote: Hi Is it possible to add a new document

Re: index writer in searchComponent

2012-07-01 Thread Peyman Faratin
for this purpose? That is, ask via solrj api 1-2 and perform 3 if entity (assuming you mean document or some field value by X) didn't exist, i.e. add it to the index. // Dmitry On Sun, Jul 1, 2012 at 6:03 AM, Peyman Faratin pey...@robustlinks.com wrote: Hi Erik The workflow I'd like

synonym file

2012-08-02 Thread Peyman Faratin
Hi I have a (23M) synonym file that takes a long time (3 or so minutes) to load and, once included, seems to adversely affect the QTime of the application by approximately 4 orders of magnitude. Any advice on how to load it faster and lower the QTime would be much appreciated. best Peyman

recommended SSD

2012-08-23 Thread Peyman Faratin
Hi Is there a SSD brand and spec that the community recommends for an index of size 56G with mostly reads? We are evaluating this one http://www.newegg.com/Product/Product.aspx?Item=N82E16820227706 thank you Peyman

setting bq in searchcomponent

2013-03-21 Thread Peyman Faratin
Hi If I run a main query cheeze jointly with a boost query bq=spell:cheeze (boosting results with spell field cheeze), as /select?fl=title&qf=main&bq=spell:cheeze&bq=trans:cheeze&q=cheeze everything works fine. And defType=dismax What I'd like to do is to programmatically generate the bq query
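As a rough illustration of generating the boost parameters in code, the plain-Java sketch below builds a request of the shape described above, with one `bq` parameter per boost field. It only shows the parameter layout on the client side; it does not show how to set `bq` from inside a SearchComponent, which is what the post goes on to ask. The names `BoostQuerySketch` and `buildSelect` are hypothetical, and the fixed `fl`, `qf`, and `defType` values are copied from the example in the post.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper that assembles a dismax request with repeated bq parameters.
final class BoostQuerySketch {
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(e);
        }
    }

    static String buildSelect(String term, List<String> boostFields) {
        StringBuilder sb = new StringBuilder("/select?fl=title&qf=main&defType=dismax");
        sb.append("&q=").append(encode(term));
        for (String field : boostFields) {
            // One bq parameter per boost field, e.g. bq=spell:cheeze (URL-encoded).
            sb.append("&bq=").append(encode(field + ":" + term));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(buildSelect("cheeze", Arrays.asList("spell", "trans")));
    }
}
```

Note that the colon in each boost clause is percent-encoded as `%3A` in the resulting string.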

Upgrading from 3.6.1 to 4.3.0 and Custom collector

2013-06-17 Thread Peyman Faratin
Hi I am migrating from Lucene 3.6.1 to 4.3.0. I am however not sure how to migrate my custom collector below. This page http://lucene.apache.org/core/4_3_0/MIGRATE.html gives some hints but the instructions are incomplete and looking at the source examples of custom collectors makes me want

cores sharing an instance

2013-06-28 Thread Peyman Faratin
Hi I have a multicore setup (in 4.3.0). Is it possible for one core to share an instance of its class with other cores at run time? i.e. At run time core 1 makes an instance of object O_i core 1 -- object O_i core 2 --- core n then can core K access O_i? I know they can share properties but

Re: cores sharing an instance

2013-06-29 Thread Peyman Faratin
(instanceDir paths, logging config maybe?). Why are you trying to do this? On Sat, Jun 29, 2013 at 1:14 AM, Peyman Faratin pey...@robustlinks.com wrote: Hi I have a multicore setup (in 4.3.0). Is it possible for one core to share an instance of its class with other cores at run time? i.e

Re: cores sharing an instance

2013-06-30 Thread Peyman Faratin
class loaders. or find a place inside the solr, before the core is created. Google for montysolr to see the example of the first approach. But, unless you really have no other choice, using singletons is IMHO a bad idea in this case Roman On 29 Jun 2013 10:18, Peyman Faratin pey

Re: cores sharing an instance

2013-06-30 Thread Peyman Faratin
, so there's no reason a singleton approach wouldn't work that I can think of. All the multithreaded caveats apply. Best Erick On Fri, Jun 28, 2013 at 3:44 PM, Peyman Faratin pey...@robustlinks.com wrote: Hi I have a multicore setup (in 4.3.0). Is it possible for one core to share
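The singleton approach mentioned in these replies could look roughly like the plain-Java sketch below, which uses the initialization-on-demand holder idiom and a concurrent map so that several threads can share objects safely. The names `SharedRegistry`, `getInstance`, `putIfAbsent`, and `get` are hypothetical. The sketch also assumes the class is loaded by a class loader common to all cores; if each core loads its own copy of the class, each core gets its own "singleton", which is the class loader caveat raised earlier in the thread.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical process-wide registry for objects shared between cores.
final class SharedRegistry {
    private final ConcurrentMap<String, Object> objects = new ConcurrentHashMap<>();

    private SharedRegistry() {
    }

    // The holder class is initialized lazily and thread-safely by the JVM.
    private static final class Holder {
        static final SharedRegistry INSTANCE = new SharedRegistry();
    }

    static SharedRegistry getInstance() {
        return Holder.INSTANCE;
    }

    // Returns the object registered under key, registering value if none exists yet.
    Object putIfAbsent(String key, Object value) {
        Object previous = objects.putIfAbsent(key, value);
        return previous != null ? previous : value;
    }

    Object get(String key) {
        return objects.get(key);
    }

    public static void main(String[] args) {
        SharedRegistry.getInstance().putIfAbsent("O_i", "created by core 1");
        System.out.println(SharedRegistry.getInstance().get("O_i"));
    }
}
```

Any object stored this way is visible to every thread that reaches the registry, so the shared objects themselves also need to be thread-safe.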

State sharing

2013-08-17 Thread Peyman Faratin
Hi I have subclassed a SearchComponent (call this class S), and would like to implement the following transaction logic: 1- Client K calls the S's handler 2- S spawns a thread and immediately acks K using rb.rsp.add("status","complete") then terminates public void process (ResponseBuilder rb)
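The "spawn a thread and ack immediately" step can be sketched in plain Java with an executor, as below. This is not Solr code: the class `AckThenWorkSketch` and its methods `handle` and `shutdownAndWait` are hypothetical stand-ins, and the returned string merely mirrors the status value used in the post.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: hand work to a background thread and acknowledge at once.
final class AckThenWorkSketch {
    private final ExecutorService pool = Executors.newSingleThreadExecutor();

    // Queues the work and returns immediately, without waiting for it to finish.
    String handle(Runnable work) {
        pool.submit(work);
        return "complete";
    }

    // Stops accepting new work and waits for queued work to finish.
    boolean shutdownAndWait(long timeoutMillis) {
        pool.shutdown();
        try {
            return pool.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        AckThenWorkSketch sketch = new AckThenWorkSketch();
        String ack = sketch.handle(() -> System.out.println("background work running"));
        System.out.println("ack: " + ack);
        sketch.shutdownAndWait(1000);
    }
}
```

The acknowledgement here only means the work was accepted, not that it has finished, so any state the background task produces has to be kept somewhere the next request can find it.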

Re: State sharing

2013-08-20 Thread Peyman Faratin
could then maintain whatever state it needs. -- Jack Krupansky -Original Message- From: Peyman Faratin Sent: Saturday, August 17, 2013 12:29 PM To: solr-user@lucene.apache.org Subject: State sharing Hi I have subclassed a SearchComponent (call this class S), and would like

subindex

2013-09-04 Thread Peyman Faratin
Hi Is there a way to build a new (smaller) index from an existing (larger) index where the smaller index contains a subset of the fields of the larger index? thank you

Re: subindex

2013-09-08 Thread Peyman Faratin
On Wed, Sep 4, 2013 at 1:51 PM, Peyman Faratin pey...@robustlinks.com wrote: Hi Is there a way to build a new (smaller) index from an existing (larger) index where the smaller index contains a subset of the fields of the larger index? thank you

deleting a doc inside a custom UpdateRequestProcessor

2013-11-18 Thread Peyman Faratin
Hi I am building a custom UpdateRequestProcessor to intercept any doc heading to the index. Basically what I want to do is to check if the current index has a doc with the same title (I am using IDs as the unique keys, so I can't use that, and besides the logic of checking is a little more

Deleting and committing inside a SearchComponent

2013-12-03 Thread Peyman Faratin
Hi Is it possible to delete and commit updates to an index inside a custom SearchComponent? I know I can do it with solrj but due to several business logic requirements I need to build the logic inside the search component. I am using SOLR 4.5.0. thank you

Re: Deleting and committing inside a SearchComponent

2013-12-03 Thread Peyman Faratin
On Dec 3, 2013, at 8:41 PM, Upayavira u...@odoko.co.uk wrote: On Tue, Dec 3, 2013, at 03:22 PM, Peyman Faratin wrote: Hi Is it possible to delete and commit updates to an index inside a custom SearchComponent? I know I can do it with solrj but due to several business logic

commons-configuration NoClassDefFoundError: Predicate

2014-07-23 Thread Peyman Faratin
Hi I've tried all permutations with no results so I thought I'd write to the group for help. I am running commons config (http://commons.apache.org/proper/commons-configuration/) just fine via maven and ant but when I try to run the class calling the method PropertiesConfiguration via a SOLR

custom search component on solrcloud

2015-04-15 Thread Peyman Faratin
Hi I am trying to port my non-SolrCloud custom search handler to a SolrCloud one. I have read the WritingDistributedSearchComponents wiki page and looked at the Terms and QueryComponent code but the control flow of execution is still fuzzy (even given the “distributed algorithm” description).

FunctionQuery of FloatFieldSource (Lucene 5.0)

2015-07-14 Thread Peyman Faratin
Hi I am having problems accessing float values in a lucene 5.0 index via the functionquery. My setup is as follows Indexing time -- Document doc = new Document(); FieldType f = new FieldType(); f.setStored(false); f.setNumericType(NumericType.FLOAT);