Here is one sample query that I picked up from the log file :
With which client?
paul
On 2 May 2012 at 01:29, alx...@aim.com wrote:
all caching is disabled and I restarted jetty. The same results.
You can have two fields: one which is stripped, and another which
stores the original data. You can use copyField directives and make
the stripped field indexed but not stored, and the original field
stored but not indexed. You only have to upload the file once, and
only store the text once.
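A minimal schema.xml sketch of that two-field arrangement (field and type names here are hypothetical; the actual stripping would be done by the analyzer of the indexed field's type, e.g. an HTMLStripCharFilter):

```xml
<!-- Original content: kept for retrieval only -->
<field name="body_orig" type="string" indexed="false" stored="true"/>
<!-- Stripped copy: searchable only; its field type's analyzer does the stripping -->
<field name="body" type="text" indexed="true" stored="false"/>

<!-- Index-time copy, so the file only has to be uploaded once -->
<copyField source="body_orig" dest="body"/>
```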
If
Ok, thanks Otis
Another question on merging
What is the best way to monitor merging?
Is there something in the log file that I can look for?
It seems like I have to monitor the system resources (read/write IOPS etc.)
and work out when a merge happened
It would be great if I can do it by looking
Simply turn off replication during your rebuild-from-scratch. See:
http://wiki.apache.org/solr/SolrReplication#HTTP_API
the disablereplication command.
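From the wiki page above, replication on the master can be toggled over HTTP (host and port are placeholders):

```
http://master_host:port/solr/replication?command=disablereplication
http://master_host:port/solr/replication?command=enablereplication
```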
The autocommit thing was, I think, in reference to keeping
a partial rebuild from being replicated.
Autocommit is usually a
Why do you care? Merging is generally a background process, or are
you doing heavy indexing? In a master/slave setup,
it's usually not really relevant except that (with 3.x), massive merges
may temporarily stop indexing. Is that the problem?
Look at the merge policies; there are configurations
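As a sketch of those configuration knobs: in a 3.x solrconfig.xml you can select the merge policy and tune its parameters (the values shown here are just the defaults):

```xml
<mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
  <int name="maxMergeAtOnce">10</int>
  <int name="segmentsPerTier">10</int>
</mergePolicy>
```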
The FieldCache gets populated the first time a given field is referenced as
a facet and then will stay around forever. So, as additional queries get
executed with different facet fields, the number of FieldCache entries will
grow.
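For example (field names hypothetical), each distinct field you facet on adds its own FieldCache entry:

```
/select?q=*:*&facet=true&facet.field=category   <- first use populates an entry for "category"
/select?q=*:*&facet=true&facet.field=author     <- a different facet field adds a second entry
```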
If I understand what you have said, these faceted queries do
We have a fairly large scale system - about 200 million docs and fairly high
indexing activity - about 300k docs per day with peak ingestion rates of about
20 docs per sec. I want to work out what a good mergeFactor setting would be by
testing with different mergeFactor settings. I think the
But again, with a master/slave setup merging should
be relatively benign. And at 200M docs, having a M/S
setup is probably indicated.
Here's a good writeup of merge policy:
http://juanggrande.wordpress.com/2011/02/07/merge-policy-internals/
If you're indexing and searching on a single machine,
Hi,
When I tried to remove data from the UI (which in turn hits SOLR), the
whole application got stuck. When we took the log files of the UI, we
could see that this set of requests did not reach SOLR itself. In the SOLR
log file, we were able to find the following exception occurring at the
Erick, I'll do that. Thank you very much.
Regards,
Jacek
On Tue, May 1, 2012 at 7:19 AM, Erick Erickson erickerick...@gmail.com wrote:
The easiest way is to do that in the app. That is, return the top
10 to the app (by score) then re-order them there. There's nothing
in Solr that I know of
Actually we are not thinking of a M/S setup
We are planning to have x number of shards on N number of servers, each of the
shard handling both indexing and searching
The expected query volume is not that high, so don't think we would need to
replicate to slaves. We think each shard will be able
Greetings Solr folk,
How can I instruct the extract request handler to ignore metadata/headers
etc. when it constructs the content of the document I send to it?
For example, I created an MS Word document containing just the word
SEARCHWORD and nothing else. However, when I ship this doc to my
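One common approach (a sketch, not necessarily the only way): map the body explicitly with fmap.content, and shunt every other Tika-generated field into an unindexed, unstored dynamic field via uprefix. The example schema ships an "ignored" field type for exactly this:

```xml
<!-- schema.xml: unmapped Tika metadata lands in ignored_* and is dropped -->
<dynamicField name="ignored_*" type="ignored" multiValued="true"/>

<!-- request (paths/params illustrative):
     /update/extract?literal.id=doc1&fmap.content=body&uprefix=ignored_ -->
```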
Optimizing is much less important query-speed-wise
than it used to be; essentially it's not recommended much
any more.
A significant effect of optimize _used_ to be purging
obsolete data (i.e. that from deleted docs) from the
index, but that is now done on merge.
There's no harm in optimizing on
I doubt if SOLR has this capability, given that it is based on a RESTful
architecture, but I wanted to ask in case I'm mistaken.
In lucene, it is easier to gain a direct handle to the collector / scorer
and access all the results as they're collected (as opposed to the SOLR
query call that
In other words, .. as an alternative , what's the most efficient way to gain
access to all of the document ids that match a query
--
View this message in context:
http://lucene.472066.n3.nabble.com/Dumb-question-Streaming-collector-query-results-tp3955175p3955194.html
Sent from the Solr - User
Check to see if you have a CopyField for a wildcard pattern that copies to
meta, which would copy all of the Tika-generated fields to meta.
-- Jack Krupansky
-Original Message-
From: Joseph Hagerty
Sent: Wednesday, May 02, 2012 9:56 AM
To: solr-user@lucene.apache.org
Subject:
I do not. I commented out all of the copyFields provided in the default
schema.xml that ships with 3.5. My schema is rather minimal. Here is my
fields block, if this helps:
<fields>
  <field name="cust" type="string" indexed="true" stored="true"
         required="true" />
  <field name="asset" type="string"
Hi :)
I'm starting to use Solr and I'm facing a little problem with dates. My
documents have a date property which is of type 'yyyyMMdd'.
To index these dates, I use the following code:
String dateString = "20101230";
SimpleDateFormat sdf = new SimpleDateFormat("yyyyMMdd");
Date date =
The trailing Z is required in your input data to be indexed, but the Z is
not actually stored. Your query must have the trailing Z though, unless
you are doing a wildcard or prefix query.
-- Jack Krupansky
-Original Message-
From: G.Long
Sent: Wednesday, May 02, 2012 11:18 AM
To:
I can achieve this by building a query with start and rows = 0, and using
queryResponse.getResults().getNumFound().
Are there any more efficient approaches to this?
Thanks
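For a pure count, rows=0 is about as cheap as it gets: Solr still computes numFound but skips fetching any stored fields. Over HTTP the same thing looks like this (query hypothetical):

```
/select?q=field:value&rows=0

<response>
  <result name="response" numFound="12345" start="0"/>
</response>
```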
Oops... I meant to say that Solr doesn't *index* the trailing Z, but it is
stored (the stored value, not the indexed value.) The query must match the
indexed value, not the stored value.
-- Jack Krupansky
-Original Message-
From: Jack Krupansky
Sent: Wednesday, May 02, 2012 11:55 AM
That wasn't right either... the query must have the trailing Z, which Solr
will strip off to match the indexed value which doesn't have the Z. So, my
corrected original statement is:
The trailing Z is required in your input data to be indexed, but the Z is
not actually indexed by Solr (it is
Hi Robert,
On May 1, 2012, at 7:07pm, Robert Muir wrote:
On Tue, May 1, 2012 at 6:48 PM, Ken Krugler kkrugler_li...@transpac.com
wrote:
Hi list,
Does anybody know if the Suggester component is designed to work with shards?
I'm not really sure it is? They would probably have to override
Hello,
I just started using elevation for solr. I am on solr 3.5, running with Drupal
7, Linux.
1. I updated my solrconfig.xml
from
<dataDir>${solr.data.dir:./solr/data}</dataDir>
To
<dataDir>/usr/local/tomcat2/data/solr/dev_d7/data</dataDir>
2. I placed my elevate.xml in my solr's data directory.
I did some testing, and evidently the meta field is treated specially by
the ERH.
I copied the example schema, and added both meta and metax fields and
set fmap.content=metax, and lo and behold only the doc content appears in
metax, but all the doc metadata appears in meta.
Although, I
I did small research with the fairly modest result
https://github.com/m-khl/solr-patches/tree/streaming
you can start exploring it from the trivial test
I use jetty that comes with solr.
I use solr's dedupe
<updateRequestProcessorChain name="dedupe">
  <processor class="solr.processor.SignatureUpdateProcessorFactory">
    <bool name="enabled">true</bool>
    <str name="signatureField">id</str>
    <bool name="overwriteDupes">true</bool>
How interesting! You know, I did at one point consider that perhaps the
fieldname meta may be treated specially, but I talked myself out of it. I
reasoned that a field name in my local schema should have no bearing on how
a plugin such as solr-cell/Tika behaves. I should have tested my
hypothesis;
Hi:
I have been working on an integration project involving Solr 3.5.0 that
dynamically registers cores as needed at run-time, but does not contain any
cores by default. The current solr.xml configuration file is:-
<?xml version="1.0" encoding="UTF-8" ?>
<solr persistent="false" sharedLib="lib">
  <cores>
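With an empty cores list like that, cores can then be registered at run-time through the CoreAdmin API, e.g. (core name and paths hypothetical):

```
http://localhost:8983/solr/admin/cores?action=CREATE&name=core1&instanceDir=/path/to/core1&dataDir=/path/to/core1/data
```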
: String dateString = "20101230";
: SimpleDateFormat sdf = new SimpleDateFormat("yyyyMMdd");
: Date date = sdf.parse(dateString);
: doc.addField("date", date);
:
: In the index, the date 20101230 is saved as 2010-12-29T23:00:00Z (because
: of GMT).
"because of GMT" is misleading and vague ... what
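A small self-contained illustration of the offset (a sketch; Europe/Paris stands in for any UTC+1 zone): parsing 'yyyyMMdd' in a local zone shifts the stored UTC instant back an hour, while parsing in UTC keeps the intended date intact.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

public class SolrDateDemo {
    // Format a "yyyyMMdd" string as a Solr-style UTC timestamp,
    // parsing it in the given time zone.
    static String toSolrDate(String yyyymmdd, TimeZone parseZone) throws Exception {
        SimpleDateFormat in = new SimpleDateFormat("yyyyMMdd");
        in.setTimeZone(parseZone);
        Date d = in.parse(yyyymmdd);
        SimpleDateFormat out = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'");
        out.setTimeZone(TimeZone.getTimeZone("UTC"));
        return out.format(d);
    }

    public static void main(String[] args) throws Exception {
        // Parsed in a UTC+1 zone, local midnight is 23:00 UTC the previous day:
        System.out.println(toSolrDate("20101230", TimeZone.getTimeZone("Europe/Paris")));
        // -> 2010-12-29T23:00:00Z
        // Parsed in UTC, the stored instant matches the intended date:
        System.out.println(toSolrDate("20101230", TimeZone.getTimeZone("UTC")));
        // -> 2010-12-30T00:00:00Z
    }
}
```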
On Wed, May 2, 2012 at 12:16 PM, Ken Krugler
kkrugler_li...@transpac.com wrote:
What confuses me is that Suggester says it's based on SpellChecker, which
supposedly does work with shards.
It is based on spellchecker apis, but spellchecker's ranking is based
on simple comparators like string
i've installed tomcat7 and solr 3.6.0 on linux/64
i'm trying to get a single webapp + multicore setup working. my efforts
have gone off the rails :-/ i suspect i've followed too many of the
wrong examples.
i'd appreciate some help/direction getting this working.
so far, i've configured
Hello everybody,
I have a question about synonyms in Solr. In our company we are looking
for a solution that resolves synonyms from a database rather than from a text file
as SynonymFilterFactory does.
The idea is to save all the synonyms in the database and index them, so they will be
ready to
I'm not sure I completely follow, but are you simply saying that you want to
have a synonym filter that reads the synonym table from a database rather
than the current text file? If so, sure, you could develop a replacement for
the current synonym filter which loads its table from a database,
Another solution is to write a script to read the database and create the
synonyms.txt file, dump the file to solr and reload the core.
This gives you the custom synonym solution.
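A sketch of the formatting step of such a script, in Java, with the database read stubbed out as an in-memory map (table and column names would be your own; a real script would fill the map from a JDBC query):

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SynonymsFileBuilder {
    // Render rows from a (hypothetical) synonyms table as
    // SynonymFilterFactory lines: each line is one equivalence group.
    static String buildSynonymsFile(Map<String, List<String>> groups) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, List<String>> e : groups.entrySet()) {
            sb.append(e.getKey());
            for (String syn : e.getValue()) {
                sb.append(", ").append(syn);
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // In a real script this map would come from the database.
        Map<String, List<String>> groups = new LinkedHashMap<String, List<String>>();
        groups.put("tv", Arrays.asList("television", "telly"));
        groups.put("gb", Arrays.asList("gib", "gigabyte"));
        System.out.print(buildSynonymsFile(groups));
    }
}
```

After writing the output to synonyms.txt in the core's conf directory, a core RELOAD picks it up.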
-Original Message-
From: Jack Krupansky [mailto:j...@basetechnology.com]
Sent: Wednesday, May 02, 2012
I don't know if this will help, but I usually add a dataDir element to
each core's solrconfig.xml to point at a local data folder for the core,
like this:
<!-- Used to specify an alternate directory to hold all index
     data other than the default ./data under the Solr home.
     If
I chronicled exactly what I had to configure to slay this dragon at
http://vinaybalamuru.wordpress.com/2012/04/12/solr4-tomcat-multicor/
Hope that helps
You are missing the pf, pf2, and pf3 request parameters, which specify
which fields to do phrase proximity boosting on.
pf boosts using the whole query as a phrase, pf2 boosts bigrams, and
pf3 boost trigrams.
You can use any combination of them, but if you use none of them, ps
appears to be
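For example (field names hypothetical), an edismax request using all three:

```
defType=edismax
&q=apache solr merge policy
&qf=title body
&pf=title body     <- whole query as a phrase
&pf2=title body    <- all bigrams ("apache solr", "solr merge", ...)
&pf3=title body    <- all trigrams
&ps=2              <- phrase slop applied to the phrase boosts
```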
Thanks for your answers. Now I have another question: if I develop a
filter to replace the current synonym filter, I understand that this
process would have to happen at indexing time, because at query time
there are a lot of known problems. If so, how should I go about creating my index
file?
Hello Prabhu,
Look at SPM for Solr (URL in sig below). It includes Index Statistics graphs,
and from these graphs you can tell:
* how many docs are in your index
* how many docs are deleted
* size of index on disk
* number of index segments
* number of index files
* maybe something else I'm
Anyone have any clues about this exception? It happened during the
course of normal indexing. This is new to me (we're running solr 3.6 on
tomcat 6/redhat RHEL) and we've been running smoothly for some time now
until this showed up:
Red Hat Enterprise Linux Server release 5.3 (Tikanga)
: How do I search for things that have no value or a specified value?
Things with no value...
(*:* -fieldName:[* TO *])
Things with a specific value...
fieldName:A
Things with no value or a specific value...
(*:* -fieldName:[* TO *]) fieldName:A
...or if you aren't using
Sounds good. OR in the negation of any query that matches any possible
value in a field.
The Solr query parser doc lists the open range as you used:
-field:[* TO *] finds all documents without a value for field
See:
http://wiki.apache.org/solr/SolrQuerySyntax
This also includes pure
Oops... that is:
(-fname:*) OR fname:(A B C)
or
(-fname:[* TO *]) OR fname:(A B C)
-- Jack Krupansky
-Original Message-
From: Jack Krupansky
Sent: Wednesday, May 02, 2012 7:48 PM
To: solr-user@lucene.apache.org
Subject: Re: syntax for negative query OR something
Sounds good. OR
Hmmm... I thought that worked in edismax. And I thought that pure negative
queries were allowed in SolrQueryParser. Oh well.
In any case, in the Lucene or Solr query parser, add *:* to select all
docs before negating the docs that have any value in the field:
(*:* -fname:*) OR fname:(A B C)
There are lots of different strategies for dealing with synonyms, depending
on what exactly is most important and what exactly you are willing to
tolerate.
In your latest example, you seem to be using string fields, which is
somewhat different from the text synonyms we talk about in Solr.
(12/05/03 1:39), Noordeen, Roxy wrote:
Hello,
I just started using elevation for solr. I am on solr 3.5, running with Drupal
7, Linux.
1. I updated my solrconfig.xml
from
<dataDir>${solr.data.dir:./solr/data}</dataDir>
To
<dataDir>/usr/local/tomcat2/data/solr/dev_d7/data</dataDir>
2. I placed my
I think a regular sync of the database table with the synonym text file is the
simplest of the solutions. It allows you to use Solr natively without
any customization, and it is not a very complicated operation to update the
synonyms file with entries from the database.
thanks!
On Wed, May 2, 2012 at 4:43 PM, Chris Hostetter
hossman_luc...@fucit.org wrote:
: How do I search for things that have no value or a specified value?
Things with no value...
(*:* -fieldName:[* TO *])
Things with a specific value...
fieldName:A
Things with no value
Jack,
Yes, the queries work fine till I hit the OOM. The fields that start with
S_* are strings, F_* are floats, I_* are ints, and so on. The dynamic field
definitions from schema.xml :
<dynamicField name="S_*" type="string" indexed="true" stored="true"
              omitNorms="true"/>
<dynamicField name="I_*"