I am a newbie at Solr. I have done everything in the Solr tutorial section. I
am using the latest versions of both the JDK (1.6.03) and Solr (2.2). I can see
the Solr admin page at http://localhost:8983/solr/admin/, but when I hit the
search button I receive an HTTP error:
HTTP ERROR: 400
Invalid value
-Original e-mail message-
From: Ryan McKinley [EMAIL PROTECTED]
Date: Tue, 20 Nov 2007 07:16:53 +0200
To: solr-user@lucene.apache.org
Subject: Re: Invalid value 'explicit' for echoParams parameter
AHMET ARSLAN wrote:
I am a newbie at solr. I have done everything in the solr tutorial
I have extracted text from .pdf files and I have also
inserted the page numbers of the .pdf file into the text. My
document looks something like:
<content>
  <page no="2">..Some Text..</page>
  <page no="3">..Some Text..</page>
  ..
</content>
But what about a standard setup? Is there a way to do this? We have not
yet decided how we run our production servers. At the moment we're
developing an enterprise search for our intranet...
Reloading a core is only possible if you are using an installation
with solr.xml (i.e. a multi-core setup).
Has anyone implemented a dismax-type solution that also uses a default
operator (or q.op)?
Dismax ignores the default operator.
I'd like to be able to use OR operators for all the qf fields, but have
read that dismax qf does not support operators.
Dismax has an mm [1] (Minimum 'Should' Match) parameter
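A hedged illustration of how mm can stand in for a default operator: mm=100% makes every term required (effectively AND), while lower values relax it toward OR. Field names below are made up:

```
q=rock music&defType=dismax&qf=title body&mm=100%
```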
--- On Mon, 4/12/10, Ahmet Arslan iori...@yahoo.com wrote:
From: Ahmet Arslan iori...@yahoo.com
Subject: Re: AW: refreshing synonyms.txt - or other configs
To: solr-user@lucene.apache.org
Date: Monday, April 12, 2010, 5:08 PM
Yes, I am using solr.xml, although there is only one core.
You could make your own little plugin RequestHandler that
did the reload
though. The RQ could get the CoreContainer from the
SolrCore retrieved
from the request, and then call reload on the core.
Awesome, this piece of code reloaded schema.xml, stopwords.txt etc. Thanx!
public class
Using Solr 1.4 I wanted to construct a query that returns
documents that
have a particular field value or are missing the field. The
first thing I
came up with was:
field1:particularvalue OR -field1:[* TO *]
It turns out the -field1:[* TO *] was being ignored. If
particularvalue
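A commonly used rewrite for this, anchoring the pure-negative clause to the full document set so it is no longer ignored, is:

```
field1:particularvalue OR (*:* -field1:[* TO *])
```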
Thanks for the reply, but how will I get the stored value instead of the
indexed value? Where do I need to configure this to get the stored value
instead of the indexed value?
Please help...
You need to remove HTML tags before the analysis phase (charfilter, tokenizer,
tokenfilter). For example, if you are using DIH
Actually I am using the SolrJ client.
Is there any way to do the same using SolrJ?
Thanks.
If you are using Java, life is easier. You can use this static function before
adding a field to SolrInputDocument.
static String stripHTMLX(String value) {
StringBuilder out = new StringBuilder();
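To make this concrete, here is a self-contained, stdlib-only sketch along the lines of the truncated stripHTMLX above. A production setup would rather reuse Lucene's HTMLStripCharFilter; the class name and the regex approach here are purely illustrative:

```java
// Strip HTML tags from a field value before adding it to a
// SolrInputDocument. Regex-based sketch; not a full HTML parser.
public class HtmlStrip {

    static String stripHTML(String value) {
        return value
                .replaceAll("<[^>]*>", " ")   // drop anything tag-like
                .replaceAll("\\s+", " ")      // collapse runs of whitespace
                .trim();
    }

    public static void main(String[] args) {
        System.out.println(stripHTML("<p>Hello <b>world</b></p>"));
    }
}
```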
I am facing a problem getting the facet result count; I must be
wrong somewhere. I am getting the proper result count when searching by a
single word, but when searching by a string the result count becomes wrong.
For example:
search keyword: Bagdad bomb blast.
I am getting a result count of 5 for the facet
I am rather new to Solr and have a question.
We have around 200,000 txt files which are placed into the file cloud.
The file path is something similar to this:
file/97/8f/840/fa4-1.txt
file/a6/9d/ab0/ca2-2.txt etc.
and we also store the metadata (like title, description, tags etc.)
I didn't find anything about my problem...
How can I replace an ampersand at index time?
My autosuggest words have ampersands. How can I replace this sign (&)?
The easiest way is to use MappingCharFilterFactory before your tokenizer:
<charFilter class="solr.MappingCharFilterFactory" mapping="mapping.txt"/>
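The mapping file referenced by the charFilter uses simple `"source" => "target"` lines. For the ampersand question above, an entry like this (the file name and the target text are placeholders) would do:

```
"&" => "and"
```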
I need to perform wildcard search in phrase query. I have 2
documents
containing text how do impair and how to improve. I
want to be able to
search both documents by searching (how to im*). There is a
provision in
lucene which allows me to perform this operation using
SpanWildcardQuery and
I'm setting up my Solr index to be
updated every x minutes.
Does Solr cache the result of a search, and then when next
time the same search is requested, it'd recognize that the
Index has not changed and therefore just return the previous
result from cache without processing the search
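Yes: Solr keeps a queryResultCache, configured in solrconfig.xml, which serves repeated queries until a commit opens a new searcher, so an unchanged index answers repeats from cache. A typical entry (sizes are illustrative, to be tuned):

```
<queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/>
```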
We want to index and search our intranet documents.
The field body contains HTML tags.
In our schema.xml we have a fieldType text_de (see at the
end of this mail) which uses the charFilter
solr.HTMLStripCharFilterFactory at index time.
So this is no problem; the text is put into the index
Hi everybody:
I have a big problem with the amount of memory Solr is using on a server.
I am starting Solr with the java -jar start.jar command on an Ubuntu
server; the start.jar process is using 7 GB of memory on the server,
and it is considerably affecting the performance of the
And what is the recommended max memory size I should use? Is there any
recommendation?
What is your index size?
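For reference, the heap is capped with the standard JVM flags when starting Jetty; the values below are only placeholders to be tuned against the index size:

```
java -Xms512m -Xmx2048m -jar start.jar
```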
I tried that and got the following result. Do I have to do
anything other
than the mentioned instructions to make it work?
HTTP ERROR: 500
tried to access field
org.apache.lucene.queryParser.QueryParser.field from
class
I'm quite new to Solr 1.4. I have a requirement to be able to search partial
words (sun hot = Sunway Hotel) and to search full words (sunway hotel
= Sunway Hotel). Currently, I am able to search only full words.
Does anyone have any suggestions?
Looks like a PrefixQuery. sun* hot* will
I have a field configured as text type (the default text type,
with stemming enabled at both index and query time):
<field name="MyTitle" type="text" indexed="true" stored="true" />
When I try to sort on this field, it is throwing the
exception:
HTTP Status 500 - there are more terms than
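This error arises because sorting needs at most one indexed term per document, so the usual fix is to sort on an untokenized copy of the field. A sketch (the sort field name is made up):

```
<field name="MyTitleSort" type="string" indexed="true" stored="false"/>
<copyField source="MyTitle" dest="MyTitleSort"/>
```

and then sort with sort=MyTitleSort asc.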
I used eclipse-jee-galileo-SR2-win32 to run the ant build and selected
dist-war for execution in the build. I got the following message.
I use the command prompt to invoke ant, so I am not sure about this.
Solr performed as usual, and when I tried adding defType=complexphrase
to the search URL the
I use the query {NOW-1DAY TO NOW} on a date field, and it works just fine.
I am, however, also interested in the actual value that the server
substituted for NOW; how can I have that returned in the query response?
You can see <str name="parsedquery"> in the response if you append
debugQuery=on
I used command line to build ant this time.
Before calling 'ant dist' where did you copy the ComplexPhrase-1.0.jar?
apache-solr-1.4.0\lib or apache-solr-1.4.0\example\lib?
Yes I ran solr using java -jar start.jar. I did the above
mentioned tasks
but the results were the same.
can you
I tried it by placing ComplexPhrase-1.0.jar in
apache-solr-1.4.0\lib, apache-solr-1.4.0\example\lib, and
apache-solr-1.4.0\example\solr\lib, with the same error.
You need to copy it to only apache-solr-1.4.0\lib
Maybe it is better to get a fresh copy of apache-solr-1.4.0.zip and continue
I've got a new problem: we put our items table into another database and now
I need to use multiple dataSources, but without success =(
So... here is my data-config.xml in short ;)
<dataSource name="shops" type="JdbcDataSource" ... />
<entity name="active" pk="id" dataSource="shops"
        query="select id
Multiple spellcheckers may be specified by name in solrconfig, such as
<str name="name">jarowinkler</str>; however, how does one make a
request to this particular spellchecker, as opposed to the one named
default?
With spellcheck.dictionary parameter.
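A hedged example request (the handler path and the query term are placeholders):

```
/solr/spell?q=jhon&spellcheck=true&spellcheck.dictionary=jarowinkler
```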
I am currently having serious performance problems with
date range queries. What I am doing is validating a
dataset's published status by a valid_from and a valid_till
date field.
I did get a performance boost of ~ 100% by switching from a
normal solr.DateField to a solr.TrieDateField
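For reference, the trie variant is declared in schema.xml roughly as in the stock 1.4 example schema; the field names below are illustrative:

```
<fieldType name="tdate" class="solr.TrieDateField" omitNorms="true" precisionStep="6" positionIncrementGap="0"/>
<field name="valid_from" type="tdate" indexed="true" stored="true"/>
<field name="valid_till" type="tdate" indexed="true" stored="true"/>
```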
I am wondering whether KeywordTokenizerFactory will work or
not in a text field. Actually, as I understood it,
KeywordTokenizerFactory tokenizes the keyword.
For example: 'solr user' will be tokenized to 'solr' and
'user' because solr and user are keywords.. My
Folks,
Greetings.
Using the dismax query parser, is there a way to perform prefix
matching? For example: if I have a field called 'booktitle' with the
actual values 'Code Complete', 'Coding standard 101', then I'd like to
search for the query string 'cod' and have dismax match against
I wanted to do phrase search. What are the analyzers
best suited for phrase search? I tried textgen,
but it did not yield the expected results.
I wanted to index:
my dear friend
If I search for dear friend, I should get the result, and
if I search for friend dear I should
commits : 135
autocommits : 0
optimizes : 0
rollbacks : 0
expungeDeletes : 0
docsPending : 8842
adds : 8842
deletesById : 0
deletesByQuery : 0
errors : 0
cumulative_adds : 8842
cumulative_deletesById : 20390
cumulative_deletesByQuery : 0
cumulative_errors : 0
I just realized that
I've looked through the history and tried a lot of things
but can't quite get
this to work.
Used this in my last attempt:
<fieldType name="lowercase" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
Hi, I'm a long-time lurker,
first-time poster. I have an issue with a
search filter I need to resolve and I'm not sure how to
handle it. I
have documents like the one below that I am searching
against. The
field editionsarray is only present in the document if it
has
specific editions
I recently started to work on Solr, so I am still very new to it. Sorry
if I am logically wrong.
I have two tables, parent and referenced (child).
For that I set a multivalued field; following are my schema details:
<field name="id" type="string" indexed="true" stored="true" required="true" />
My specific use case is instead of using
dataimporter.last_index_time I want
to use something like
dataimporter.updated_time_of_last_document. Our DIH is
set up to use a bunch of slave databases and there have
been problems with
some documents getting lost due to replication lag.
Can you
Hello.
I have a little problem.
I want to import a keywords field from my database which looks like this:
Car, Radio, Car Radio, ...
I import this with my DIH and I analyze it with the PatternTokenizerFactory:
<tokenizer class="solr.PatternTokenizerFactory" pattern=", *" />
Hi all,
I have a problem with range queries on an integer field.
(Solr 1.4)
In my index, myField contains values between 0 and 3000.
<field name="myField" type="pint" indexed="true" stored="true" required="false"/>
Here are a few samples to give you an idea of the problem:
fq=myField:[1 TO
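In the stock schema, pint maps to the plain solr.IntField, whose indexed terms compare as strings, which is a common cause of out-of-range results. A sketch of the usual fix, switching to the trie type (as is also suggested for bitrate later in this digest); a re-index is required:

```
<fieldType name="tint" class="solr.TrieIntField" precisionStep="8" omitNorms="true" positionIncrementGap="0"/>
<field name="myField" type="tint" indexed="true" stored="true" required="false"/>
```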
How do I implement a requirement like if category is xyz,
the price should
be greater than 100 for inclusion in the result set.
In other words, the result set should contain:
- all matching documents with category value not xyz
- all matching documents with category value xyz and price
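One way to express this as a filter query, using *:* to anchor the negative clause (note that [100 TO *] includes 100; tighten the bound if strictly-greater is needed):

```
fq=(*:* -category:xyz) OR (category:xyz AND price:[100 TO *])
```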
I am forming a query to boost certain ids; the list of ids can go up to
2000. I am sometimes getting the error for too many clauses in the
boolean query, and otherwise I am getting a null page. Can you suggest
any config changes regarding this?
I am using Solr 1.3.
For too many
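The truncated reply likely continues with the maxBooleanClauses limit in solrconfig.xml (1024 by default); raising it is the usual configuration change, e.g.:

```
<maxBooleanClauses>4096</maxBooleanClauses>
```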
Hi,
Thanks for your response. Attached are the Schema.xml and sample docs
that were indexed. The query and response are as below. The attachment
Prodsku4270257.xml has a field paymenttype whose value is 'prepaid'.
query:
I have indexed person names in solr using synonym expansion
and am getting a
match when I explicitly use that field in my query
(name:query). However,
when I copy that field into another field using copyfield
and search on that
field, I don't get a match. Below are excerpts from
I am trying to configure automatic deduplication for Solr 1.4 in VuFind.
I followed:
http://wiki.apache.org/solr/Deduplication
Actually nothing happens; all records are being imported
without any deduplication.
Does 'being imported' mean you are using the DataImportHandler? If yes you can use
I must be missing something very
obvious here. I have a filter query like so:
(-rootdir:somevalue)
I get results for that filter
However, when I OR it with another term like so I get
nothing:
((-rootdir:somevalue) OR (rootdir:somevalue AND
someboolean:true))
Simply you cannot
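The truncated answer refers to Lucene's handling of pure-negative clauses: a negative-only sub-query matches nothing on its own inside an OR. The standard rewrite anchors it to all documents:

```
(*:* -rootdir:somevalue) OR (rootdir:somevalue AND someboolean:true)
```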
Now I also want to offer a slider to define the range to
include in the result set. However here I do not want to do
faceting, instead I just want to find out the min and max
date values in the result (without any of the facet filters
applies) so I know the start and end points for the
http://wiki.apache.org/solr/StatsComponent can give you
min and max values.
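StatsComponent is enabled per request, roughly like this (the field name is illustrative):

```
/solr/select?q=*:*&stats=true&stats.field=price&rows=0
```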
Sorry, my bad. I just tested StatsComponent with a tdate field, and it is not
working for date-typed fields. The wiki says it is for numeric fields.
We have a need that
search engine return a specific URL for a specific search
term and that result is supposed to be the first result (per
Biz) among the result set.
This part seems like http://wiki.apache.org/solr/QueryElevationComponent
The URL is an external URL and
there is no intent
For my delta-import, I get the IDs which should be updated from an
extra table in my database.
... When DIH has finished the delta-import, it is necessary that the
table with the IDs be deleted.
Can I put an SQL query in the DIH for that issue?
deletedPkQuery (sql query) is used in
Hm, I think I can use deletedPkQuery, but it doesn't work for me; maybe you
can help me. Here is my config:
<entity name="item" pk="id"
        transformer="script:BoostDoc"
        query="select i.id, i.shop_id, i.is_active, i.shop ..."
        deltaImportQuery="select i.id, i.s ... WHERE ... AND
How do I index an URL without
indexing the content? Basically our requirement is that - we
have certain search terms for which there need to be a URL
that should come right on top. I tried to use elevate option
within Solr - but from what I know - I need to have an id of
the indexed content
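For reference, elevation is driven by conf/elevate.xml, which maps a query text to the ids of documents to pin on top; this is why an indexed document with an id is needed. The values below are placeholders:

```
<elevate>
  <query text="some search term">
    <doc id="DOC-ID-1"/>
  </query>
</elevate>
```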
Hi,
I want to map my Solr fields using a customized DataImport handler.
For example, I have fields like:
<field column="NAME" name="field1" />
<field column="NO" name="field2" />
Actually my column names come dynamically from another table; they vary
from client to client.
instead of
That's what I'm trying! :D
I don't want to do this with another script, because I never know when a
delta-import is finished, and when it has completed, I don't know with
which result: complete, fail, ...?!
If you are updating your index *only* with DIH, after every full/delta import
commit and
I don't know what pattern the user will configure the columns with in a
separate table. I have to read this table to map the Solr fields to these
columns, so I can't use dynamic fields either, and Transformers also seem
to be of no use in this case.
You don't need to know column names. You can
How can I tell Solr to start the jar after every delta-import, NOT after
every full-import?
You cannot distinguish between delta and full, so you need to do it in your jar
program. In your Java program you need to send a GET request to the URL
http://localhost:8080/solr/dataimport
if
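The GET described above can poll DIH's status command, whose XML response reports whether the last import completed or failed (host and port as in the reply):

```
http://localhost:8080/solr/dataimport?command=status
```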
I am using a dismax query to fetch docs from Solr, where I have set some
boost on each of the fields.
If I search for the query Rock I get the following docs with the boost
value which I have specified:
<doc>
  <float name="score">19.494072</float>
  <int name="bitrate">120</int>
  <str
bitrate should be trie-based tint, not int, for range queries to work correctly.
I am digging more now. Can we combine both the scenarios?
q=rock&fq={!field f=content}mp3
q=rock&fq:bitrate:[* TO 128]
Say if I want only mp3 from 0 to 128
You can
q=rock&fq:bitrate:[* TO 128]
bitrate is int
This also returns docs with a bitrate of more than 128. Is there
something I am doing wrong?
If you are using Solr 1.4.0 you need to use
<fieldType name="tint" class="solr.TrieIntField" precisionStep="8" omitNorms="true" positionIncrementGap="0"/>
and restart Solr.
Basically, for some use cases I would like to show duplicates; for others
I want them ignored.
If I have overwriteDupes=false and I just create the dedup
hash how can I
query for only unique hash values... ie something like a
SQL group by.
TermsComponent maybe?
or faceting?
q=*:*&facet=true&facet.field=signatureField&defType=lucene&rows=0&start=0
If you append facet.mincount=1 to the above URL you can see your
duplications.
After re-reading your message: sometimes you want to show duplicates, sometimes
you don't want them. I
Hi everyone, I'd like to know if it's possible to use the
*defaultSearchField* on more than one field, i.e.
<defaultSearchField>field1, field2, field3</defaultSearchField>
No. But you can query multiple fields using dismax:
qf=field1 field2 field3&defType=dismax
I created a jar file; this jar file deletes my table.
But Solr absolutely does not want to start this JAR. I put a run.bat file
into the folder where my jar is saved. This batch file runs and deletes
the table, but when Solr starts this batch file, it doesn't work. I don't
know why !?!?!?
I tested the
I am running Solr on a 64-bit HP-UX system. The total index size is about
5 GB, and when I try to load any new document, Solr tries to merge the
existing segments first and it results in the following error. I can see
a temp file growing within the index dir to around 2 GB in size, and
later it fails with
I am getting an error in Solr:
Error loading class 'Solr.TrieField'
I have added the following in the types section of the schema file:
<fieldType name="tint" class="solr.TrieField" omitNorms="true" />
And in the custom fields of the schema I have added:
<field name="bitrate" type="tint" indexed="true" stored="true" />
I am
And the request I am passing is
/solr/select?indent=on&version=2.2&q=rock&fq={!field%20f=content}mp3&fq:bitrate:[* TO 127]&start=0&rows=10&fl=*%2Cscore&qt=dismax&wt=standard&explainOther=&hl.fl=
Still I am seeing documents above bitrate 127.
There is a typo: instead of fq: there should be fq=.
In my SolrJ-using application, I have a test case which queries for
"numéro" and succeeds if I use Embedded and fails if I use
CommonsHttpSolrServer... I don't want to use Embedded for a number of
reasons, including that it's not recommended
(http://wiki.apache.org/solr/EmbeddedSolr)
I am sorry
I have a query need that requires multiple OR conditions, and there must be
a match in each condition for the query to provide a result.
The search would be *(A or B) AND (C or D)* and the only valid results it
could turn up are:
A C
A D
B C
B D
Can anyone provide guidance as
Can anyone provide guidance as
How do we make sure that searches for terms like AM do not match docs
which have something like 5a.m etc.?
In analysis on the admin page, it looks like WordDelimiterFilterFactory is
splitting on '.'; how can I make it work so that I can use the features of
the word delimiter as well as make sure
Thank you. That seems to be working well, except when I include a wildcard
for any of the terms, the wildcard term isn't being found.
My searches are actually:
q=+(A A*) +(C C*)&q.op=OR
When I do a regular search on A* or C* I get matches, but not in the
context of the above
Thanks for the response again. The best way I could
illustrate our live
search feature is an example implementation:
http://www.krop.com/
Notice when you search the word senior in the keywords
field, the results
filter down to just the job postings with that word in it.
So it's not
We don't mind the order of terms. We are basically sorting by two variables
that are independent of relevancy. So I would assume the order doesn't
matter... we just need to make sure any results we filter
down to (as you
saw in the krop.com example) contain the words the user has
typed.
--- On Fri, 5/28/10, efr...@gmail.com efr...@gmail.com wrote:
From: efr...@gmail.com efr...@gmail.com
Subject: Re: Does SOLR Allow q= (A or B) AND (C or D)?
To: solr-user@lucene.apache.org
Date: Friday, May 28, 2010, 4:42 AM
Hi Ahmet,
Thanks again for the feedback. We will be searching
Solr 1.4
You haven't identified the version of Luke you're
using.
Luke 1.0.1 (2010-04-01)
I think with Solr you need to use release 0.9.9.1 or 0.9.9, because
Solr 1.4.0 uses Lucene 2.9.1.
Thanks for the suggestion. I tried
0.9.9.1 but saw the same problem.
I didn't see 0.9.9 on their download page.
http://www.getopt.org/luke/ has the 0.9.9 version. But that may not be the issue.
I suspect that trie-based fields are causing this, because they index each
value at various levels of
I have some experience using MLT with
the StandardRequestHandler with Python
but I can't figure out how to do it with solrj. It seems
that to do
MLT with solrj I have
to use MoreLikeThisRequestHandler and there seems no way to
use
StandardRequestHandler for MLT with solrj (please correct
Hi,
I have a schema with id as one of the fields. I index some documents
(by adding and deleting some documents).
When I perform faceting on all documents (q=*:*) with facet.field=id,
I even get those ids for which the document is deleted,
for example: (025_null, 026_null
Hi Ahmet,
But I use SolrJ to commit documents, and there is no commit method which
allows you to mention expungeDeletes.
Alternatively you can do it with SolrQuery:
final SolrQuery query = new SolrQuery();
query.set("qt", "/update");
query.set("commit", "true");
We have Highlighting enabled. We specify that we only
want highlighting on
the body property.
When doing a query like this: (body:tester OR
project_id:704) the text
highlighted in the body includes any text that is tester
and any text that
is 704.
Is there a way to prevent the
Using the solr-lucene query parser, is there a difference between using
AND and using + in queries like this:
1) q= some_field:( one AND two AND some
phrase)
2) q= some_field:(+one +two +some
phrase)
Are those always exactly identical in all respects, or are
there any differences
Say my search query is new york, and I am searching field1 and field2
for it. How do I specify that I want to exclude docs where field3 doesn't
exist?
http://search-lucene.com/m/1o5mEk8DjX1/
I could be wrong, but it seems this way has a performance hit?
Or am I missing something?
Did you read Chris's message in http://search-lucene.com/m/1o5mEk8DjX1/ ?
He proposes an alternative (more efficient) way other than [* TO *].
<copyfield source="title" dest="sortTitle" />
The simple lowercase f is causing this: it should be copyField.
P.S. Might it be helpful for Solr to complain about invalid
XML during startup? Does it do this and I'm just not
noticing?
Chris's explanation about a similar topic:
http://search-lucene.com/m/11JWX1hxL4u/
I have an issue with range queries on a long value in our
dataset (the dataset is fairly large, but i believe the
problem still exists for smaller datasets). When i
query the index with a range, as such: id:[1 TO 2000], I get
values back that are well outside that range. It's as if the
Meanwhile, I'd like to try using POST, but I didn't find
information
about how to do this. Could someone point me to a link to
some
sample code?
you can pass METHOD.POST to query method of SolrServer.
public QueryResponse query(SolrParams params, METHOD method)
It appears that the defType parameter is not being set by the request
handler.
What do you get when you append echoParams=all to your search URL?
So you have something like this entry in solrconfig.xml:
<requestHandler name="standard" class="solr.SearchHandler" default="true">
  <lst name="defaults">
    <str
Thanks, Ahmet.
Yes, my solrconfig.xml file is very similar to what you
wrote.
When I use echoparams=all and defType=myqp, I get:
<lst name="params">
  <str name="q">hi</str>
  <str name="echoparams">all</str>
  <str name="defType">myqp</str>
</lst>
However, when I do not use the defType (hoping it will be
solr.PhoneticFilterFactory looks suspicious. Can you verify this on solr admin
analysis.jsp page. You can debug your analysis chain in this page.
If you paste springsteen, it will show you output of each
tokenizer/tokenfilter step by step.
I'm having trouble finding a stemmer that's less aggressive
than the
porter-stemmer, ideally, one that does only plural
stemming.
Looks like PlingStemmer does this.
http://www.mpi-inf.mpg.de/yago-naga/javatools/doc/javatools/parsers/PlingStemmer.html
My understanding is that SolrJ users are supposed to escape
special characters, therefore (b) is the correct way.
If this is the case, what's the best way to escape a query
string which
might contain field names and URIs in their field values?
Easiest thing is to use RawQParserPlugin or
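SolrJ actually ships a helper for exactly this, ClientUtils.escapeQueryChars. A dependency-free sketch of the same idea follows; the character list is illustrative, not the exhaustive set the real helper covers:

```java
// Escape Lucene/Solr query syntax characters in a user-supplied value,
// in the spirit of SolrJ's ClientUtils.escapeQueryChars.
public class QueryEscape {

    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            // Common query-syntax special characters (illustrative list).
            if ("\\+-!():^[]\"{}~*?|&;".indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("name:http://example.com/"));
    }
}
```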
Is there a token filter which do the same job as
MappingCharFilterFactory but after tokenizer, reading the
same config file?
No, closest thing can be PatternReplaceFilterFactory.
http://lucene.apache.org/solr/api/org/apache/solr/analysis/PatternReplaceFilterFactory.html
I'm using SOLR 1.4 with a few
multi-cores, running under a Tomcat 6
environment. I'm using the web services to pass xml
documents for adding
records with no problem, using a URL on my development
machine of
http://localhost:8080/Solr/product/update/;
I've tried implementing an
In solrconfig.xml I have been able to change the hl.simple.pre/post
variable, but when I try to change the hl.regex pattern or hl.snippets
they don't have any effect. I thought hl.snippets would allow me to find
more than one and highlight it, and well, I tried a bunch of regex
When I want to sort the documents which contain a certain word by date or
by institution, all I get is an order that I don't understand.
<field name="datecreated" type="date" indexed="true" stored="false" />
<field name="instanta" type="int" indexed="true" stored="false" required="true" />
You need to
Would someone mind explaining how this differs from the
DefaultSimilarity?
The difference is length normalization. The default one punishes long
documents. The sweet-spot one computes a constant norm for all lengths in
the [min,max] range (the sweet spot), and smaller norm values for lengths
out of this
Thanks. I'm guessing this is all or nothing, i.e. you can't use one
similarity class for one request handler and another for a separate
request handler. Is that correct?
Correct; also a re-index is required, since length norms are calculated
and stored at index time.
I enter Chinese chars in the admin console for searching matched
documents; it does not return any, though I have uploaded some documents
that have Chinese chars.
Could it be URI Charset Config?
http://wiki.apache.org/solr/SolrTomcat#URI_Charset_Config
Yes, it does show results when I search for id:ZS1,
COL:RW, and TI:MAC,
but strangely it does not show results when I try
AD:ZS1 or AN:ZS1.
What is the output of q=AD:ZS1debugQuery=on
Also, I'm not sure where to find the default field, so I'm
fairly certain I didn't change
I am using the sample, not deploying Solr in Tomcat. Is there a place I
can modify this setting?
Ha, okay, if you are using Jetty with java -jar start.jar then it is okay.
But for Chinese you need a special tokenizer, since Chinese is written
without spaces between words.
tokenizer
Oh yes, *...* works. Thanks.
I saw the tokenizer is defined in schema.xml. There are a few places that
define the tokenizer. Wondering if it is enough to define one for:
It is better to define a brand new field type specific to Chinese.
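A sketch of such a dedicated type, using the CJK tokenizer that ships with Solr 1.4 (the type name here is a placeholder):

```
<fieldType name="text_zh" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.CJKTokenizerFactory"/>
  </analyzer>
</fieldType>
```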
I use Solr 1.4 to search content in documents (pdf, doc, odt ...). I use
the /update/extract module.
When I am searching, I am limited to the first 5 characters
(approximately).
Any word or sentence after that is not found (but the field has more than
5 characters when I recovered