Hi
I am trying to index 12MM docs faster than is currently happening in Solr
(using SolrJ). We have identified Solr's add method as the bottleneck (and not
commit, which is tuned OK through mergeFactor and maxRamBufferSize and JVM
RAM).
Adding 1000 docs is taking approximately 25 seconds.
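For scale: at 1000 docs per 25 seconds (40 docs/s), 12MM docs would take about 300,000 seconds, or roughly 83 hours. A common first thing to try is sending documents in batches rather than one add call per document. The sketch below is only an illustration, not the poster's code: IndexingEstimate, estimateHours and sendInBatches are hypothetical names, the batch size is an assumption, and the sink is a stand-in for whatever call actually ships a batch to Solr.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper class, not part of SolrJ.
final class IndexingEstimate {

    // Hours needed to index totalDocs at the observed rate
    // (docsPerSample documents every secondsPerSample seconds).
    public static double estimateHours(long totalDocs, long docsPerSample, double secondsPerSample) {
        double docsPerSecond = docsPerSample / secondsPerSample;
        return totalDocs / docsPerSecond / 3600.0;
    }

    // Hands documents to the sink in batches of at most batchSize and
    // returns how many times the sink was called. The sink is an
    // assumption: it stands in for one batched add request.
    public static <T> int sendInBatches(Iterable<T> docs, int batchSize, Consumer<List<T>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<T> batch = new ArrayList<>(batchSize);
        int calls = 0;
        for (T doc : docs) {
            batch.add(doc);
            if (batch.size() == batchSize) {
                sink.accept(new ArrayList<>(batch)); // copy so the sink owns its list
                batch.clear();
                calls++;
            }
        }
        if (!batch.isEmpty()) { // flush the final partial batch
            sink.accept(new ArrayList<>(batch));
            calls++;
        }
        return calls;
    }

    public static void main(String[] args) {
        System.out.printf("Estimated hours: %.1f%n", estimateHours(12_000_000L, 1000L, 25.0));
    }
}
```

Whether batching helps here depends on where the time actually goes, which is why profiling first is still worthwhile.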
://wiki.apache.org/lucene-java/ImproveIndexingSpeed
But throw a profiler at the indexer as a first step, just to see
where the problem is, CPU or I/O.
Best
Erick
On Sat, Mar 10, 2012 at 4:09 PM, Peyman Faratin pey...@robustlinks.com
wrote:
Hi
I am trying to index 12MM docs faster than is currently
Hi
A newbie question. I am uncertain what the best way is to design for my
requirement, which is the following.
I want to allow another client in SolrJ to query Solr with a query that is
handled by a custom handler
localhost:9090/solr/tokenSearch?tokens{!dismax
Hi
What is the best way to retrieve the score of a query across ALL documents in
the index? i.e.
given:
1) docs, [A,B,C,D,E,...M] of M dimensions
2) Query q
searcher outputs (efficiently)
1) the score of q across _all_ M dimensional documents, ordered by index
number, i.e.
score(q) =
Hi
I have a problem with the following context.
I have a field with a custom type of shingledcontent, defined as follows in
the schema.xml
<field name="shingledContent"
type="shingledcontent"
compressed="true"
Hi
Has there been any work that tries to integrate Kernel methods [1] with SOLR? I
am interested in using kernel methods to solve synonym, hyponym and polysemous
(disambiguation) problems which SOLR's Vector space model (bag of words) does
not capture.
For example, imagine we have only 3
Hi
I would like to run an MLT search (in SolrJ) of a short piece of text delivered via
the stream.body. This part works. What I would like to be able to do is 2
things:
- faceting on some number (not ALL) of the results
- cluster (using carrot2) all of the results
Is this possible? I believe
Hi
I have the following 2 field types
<fieldType name="tokenizer1" class="solr.TextField" sortMissingLast="true"
autoGeneratePhraseQueries="true">
<analyzer type="index">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
thank you Michael.
On Jun 16, 2012, at 6:40 PM, Michael Ryan wrote:
Try changing the tokenizer2 SynonymFilterFactory filter to this:
<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
ignoreCase="false" expand="true"
tokenizerFactory="solr.KeywordTokenizerFactory"/>
By default, it
Hi
Is it possible to add a new document to the index in a custom SearchComponent
(that also implements SolrCoreAware)? I can get a reference to the
indexReader via the ResponseBuilder parameter of the process() method using
rb.req.getSearcher().getReader()
But is it possible to actually add
idea, are there other processes that will
be writing to the index at the
same time?
What's the purpose here anyway? There might be a better approach
Best
Erick
On Thu, Jun 28, 2012 at 4:02 PM, Peyman Faratin pey...@robustlinks.com
wrote:
Hi
Is it possible to add a new document
for this purpose? That is, ask via solrj api
1-2 and perform 3 if entity (assuming you mean document or some field value
by X) didn't exist, i.e. add it to the index.
// Dmitry
On Sun, Jul 1, 2012 at 6:03 AM, Peyman Faratin pey...@robustlinks.com wrote:
Hi Erik
The workflow I'd like
Hi
I have a (23M) synonym file that takes a long time (3 or so minutes) to load
and once included seems to adversely affect the QTime of the application by
approximately 4 orders of magnitude.
Any advice on how to load faster and lower the QTime would be much appreciated.
best
Peyman
Hi
Is there an SSD brand and spec that the community recommends for an index of
size 56G with mostly reads? We are evaluating this one:
http://www.newegg.com/Product/Product.aspx?Item=N82E16820227706
thank you
Peyman
Hi
If I run a main query cheeze jointly with a boost query bq=spell:cheeze
(boosting results with spell field cheeze), as
/select?fl=title&qf=main&bq=spell:cheeze&bq=trans:cheeze&q=cheeze
everything works fine. And defType=dismax
What I'd like to do is to programmatically generate the bq query
Hi
I am migrating from Lucene 3.6.1 to 4.3.0. I am, however, not sure how to migrate
my custom collector below. This page
http://lucene.apache.org/core/4_3_0/MIGRATE.html gives some hints but the
instructions are incomplete and looking at the source examples of custom
collectors make me want
Hi
I have a multicore setup (in 4.3.0). Is it possible for one core to share an
instance of its class with other cores at run time? i.e.
At run time core 1 makes an instance of object O_i
core 1 -- object O_i
core 2
---
core n
then can core K access O_i? I know they can share properties but
(instanceDir paths,
logging config maybe?). Why are you trying to do this?
On Sat, Jun 29, 2013 at 1:14 AM, Peyman Faratin pey...@robustlinks.com
wrote:
Hi
I have a multicore setup (in 4.3.0). Is it possible for one core to share an
instance of its class with other cores at run time? i.e
class loaders. or find a place
inside the solr, before the core is created. Google for montysolr to see
the example of the first approach.
But, unless you really have no other choice, using singletons is IMHO a bad
idea in this case
Roman
On 29 Jun 2013 10:18, Peyman Faratin pey
, so there's no
reason a singleton approach wouldn't work that I
can think of. All the multithreaded caveats apply.
Best
Erick
On Fri, Jun 28, 2013 at 3:44 PM, Peyman Faratin pey...@robustlinks.com wrote:
Hi
I have a multicore setup (in 4.3.0). Is it possible for one core to share
Hi
I have subclassed a SearchComponent (call this class S), and would like to
implement the following transaction logic:
1- Client K calls S's handler
2- S spawns a thread and immediately acks K using
rb.rsp.add("status", "complete") then terminates
public void process (ResponseBuilder rb)
could then maintain whatever state it
needs.
-- Jack Krupansky
-Original Message- From: Peyman Faratin
Sent: Saturday, August 17, 2013 12:29 PM
To: solr-user@lucene.apache.org
Subject: State sharing
Hi
I have subclassed a SearchComponent (call this class S), and would like
Hi
Is there a way to build a new (smaller) index from an existing (larger) index
where the smaller index contains a subset of the fields of the larger index?
thank you
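One way to picture this, assuming the fields you want to keep are stored (unstored fields cannot be read back in their original form): read each document from the large index, keep only the wanted fields, and add the result to a new index. The sketch below shows just the projection step on plain maps. It does not call any Lucene or Solr API, and FieldProjection, project and projectAll are hypothetical names used for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: documents are modelled as field-name -> value maps.
final class FieldProjection {

    // Returns a copy of doc containing only the fields named in keep,
    // preserving the original field order.
    public static Map<String, Object> project(Map<String, Object> doc, Set<String> keep) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : doc.entrySet()) {
            if (keep.contains(e.getKey())) {
                out.put(e.getKey(), e.getValue());
            }
        }
        return out;
    }

    // Projects every document; the result is what would be fed to the smaller index.
    public static List<Map<String, Object>> projectAll(Iterable<Map<String, Object>> docs, Set<String> keep) {
        List<Map<String, Object>> result = new ArrayList<>();
        for (Map<String, Object> doc : docs) {
            result.add(project(doc, keep));
        }
        return result;
    }
}
```

Reading the stored documents and writing the projected ones is left out on purpose, since that part depends on the Lucene or Solr version in use.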
On Wed, Sep 4, 2013 at 1:51 PM, Peyman Faratin pey...@robustlinks.com wrote:
Hi
Is there a way to build a new (smaller) index from an existing (larger)
index where the smaller index contains a subset of the fields of the larger
index?
thank you
Hi
I am building a custom UpdateRequestProcessor to intercept any doc heading to
the index. Basically what I want to do is to check if the current index has a
doc with the same title (I am using IDs as the unique keys so I can't use that, and
besides the logic of checking is a little more
Hi
Is it possible to delete and commit updates to an index inside a custom
SearchComponent? I know I can do it with solrj but due to several business
logic requirements I need to build the logic inside the search component. I am
using SOLR 4.5.0.
thank you
On Dec 3, 2013, at 8:41 PM, Upayavira u...@odoko.co.uk wrote:
On Tue, Dec 3, 2013, at 03:22 PM, Peyman Faratin wrote:
Hi
Is it possible to delete and commit updates to an index inside a custom
SearchComponent? I know I can do it with solrj but due to several
business logic
Hi
I've tried all permutations with no results, so I thought I would write to the group
for help.
I am running commons config
(http://commons.apache.org/proper/commons-configuration/) just fine via maven
and ant but when I try to run the class calling the method
PropertiesConfiguration via a SOLR
Hi
I am trying to port my non-SolrCloud custom search handler to a SolrCloud one.
I have read the WritingDistributedSearchComponents wiki page and looked at the TermsComponent
and QueryComponent code, but the control flow of execution is still fuzzy (even
given the “distributed algorithm” description).
Hi
I am having problems accessing float values in a Lucene 5.0 index via a
FunctionQuery.
My setup is as follows
Indexing time
--
Document doc = new Document();
FieldType f = new FieldType();
f.setStored(false);
f.setNumericType(NumericType.FLOAT);