Hey,
yesterday we updated from Solr 4.0 to Solr 4.1, and since then the following
error pops up from time to time:
{msg=URLDecoder: Invalid character encoding detected after position 160 of
query string / form data (while parsing as UTF-8),code=400}
hello.
I want to compare two shards with each other, because they should have the
same index, but they don't. =(
So I want to find the documents that are missing from one of my two shards.
My idea:
- fire a distributed shard request across my nodes with a facet search on my
unique field.
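A sketch of the comparison step, assuming the unique-key values of each shard have already been fetched (e.g. by faceting on the unique field with facet.limit=-1); the class name and sample IDs are hypothetical:

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

public class ShardDiff {
    // Returns the IDs present in exactly one of the two shards,
    // i.e. the documents missing from one side or the other.
    public static Set<String> missingIds(Collection<String> shardA,
                                         Collection<String> shardB) {
        Set<String> onlyA = new HashSet<>(shardA);
        onlyA.removeAll(shardB);            // docs shard B is missing
        Set<String> onlyB = new HashSet<>(shardB);
        onlyB.removeAll(shardA);            // docs shard A is missing
        onlyA.addAll(onlyB);                // symmetric difference
        return onlyA;
    }

    public static void main(String[] args) {
        Set<String> diff = missingIds(Arrays.asList("doc1", "doc2", "doc3"),
                                      Arrays.asList("doc2", "doc3", "doc4"));
        System.out.println(diff); // doc1 and doc4 are the mismatches
    }
}
```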
hello,
in my schema:
<field name="city_name" type="text_general" indexed="false" stored="true"/>
and I have already updated 18 records.
Now I need indexed="true" for all the old data.
I need a solution, please someone help me out.
Please reply, urgent!!
thanks
On 12 February 2013 15:49, anurag.jain anurag.k...@gmail.com wrote:
hello,
in my schema:
<field name="city_name" type="text_general" indexed="false" stored="true"/>
and I have already updated 18 records.
Now I need indexed="true" for all the old data.
I need a solution
[...]
You have no choice but to change the
Now this is strange: the index generation and index version
are changing with replication.
E.g. the master has index generation 118, index version 136059533234,
and the slave has index generation 118, index version 136059533234;
both are the same.
Now add one doc to the master with a commit.
The master has index
Hello!
The simplest way will be to update your schema.xml file, make the change
that needs to be made, and fully re-index your data. Solr won't be able
to automatically change a non-indexed field into an indexed one.
You could also use the partial document update API of Solr if you
don't have your original
Our document IDs are most definitely distinct, and there are partial updates
to existing records. I have run SQL queries outside of Solr to validate the
records going in, and only about 1% are updates to existing records. There
are no deletes underway; every day new records are added or updated. Example
I'm trying to use facets alongside grouping. However, when I ask Solr to
compute grouped facet counts (group.facet=true, see
http://wiki.apache.org/solr/FieldCollapsing), it no longer honours
facet.query excludes; without it (group.facet=false) the exclude works again
without any
A couple of things to check:
1) Have you retained your Solr logs? If so, take a look in them for
indexing errors.
2) What is the difference between maxDocs and numDocs? This will give an
indication of whether a large number of records are being deleted or updated.
3) Can you explain your partial updates?
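Point 2 above is simple arithmetic: numDocs counts live documents, while maxDocs also counts deleted-but-not-yet-merged ones, so the gap estimates how many deletes/overwrites have happened. A sketch with made-up numbers:

```java
public class DocGap {
    // maxDocs includes documents that were deleted or overwritten but not
    // yet merged away; numDocs counts only live documents.
    public static long deletedDocs(long maxDocs, long numDocs) {
        return maxDocs - numDocs;
    }

    public static void main(String[] args) {
        // Hypothetical values read from the Solr admin statistics page.
        System.out.println(deletedDocs(1_050_000L, 1_000_000L)); // prints 50000
    }
}
```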
Hi,
I have a multi-shard replicated index spread across two machines.
Once a week, I delete the entire index and create it from scratch.
Today I am using ConcurrentUpdateSolrServer in SolrJ to add documents to the
index.
I want to add documents through both servers, to utilise the
Actually the problem is that I updated the data first, and then I had to add
new fields, so I made another JSON file:
[
  {
    "id": "2131",
    "newfield": {"add": 2121}
  },
  {
    "id": "21",
    "newfield": {"add": 21}
  }
]
Now I have two different files, so if I try to re-send the previous file for
indexed=true, it erases the new field.
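For what it's worth, Solr 4's atomic updates touch only the fields you name, which may avoid the erasing problem; a sketch (field name and values are hypothetical, and every field must be stored="true" for atomic updates to work):

```json
[
  {"id": "2131", "city_name": {"set": "updated value"}},
  {"id": "21",   "city_name": {"set": "another value"}}
]
```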
Hi all,
the first question: is there a way to reduce the timeout when a Solr shard
comes up? It looks in the log file as follows:
Feb 12, 2013 1:19:08 PM org.apache.solr.cloud.ShardLeaderElectionContext
waitForReplicasToComeUp
INFO: Waiting until we see more replicas up: total=2 found=1
timeoutin=178992
Marcos,
You could consider using the CoreAdminHandler instead:
http://wiki.apache.org/solr/CoreAdmin#CoreAdminHandler
It works extremely well.
Otherwise, you should periodically restart Tomcat. I'm not sure how
much memory would be leaked, but it's likely not going to have much of
an impact
I should also say that there can easily be memory leaked from permgen
space when reloading webapps in Tomcat regardless of what resources
the app creates because class references from the context classloader
to the parent classloader can't be collected appropriately, so
restarting Tomcat
Many thanks! I will try to use the CoreAdminHandler and see if that solves the
issue!
On Feb 12, 2013, at 9:05 AM, Michael Della Bitta wrote:
I should also say that there can easily be memory leaked from permgen
space when reloading webapps in Tomcat regardless of what resources
the app
By default, on cluster startup, we wait until we see all the replicas for a
shard come up. This is for safety. You may have introduced an old shard with
old data or a new shard with no data, and you don't want something like that
becoming the leader.
If you don't want to do this wait, it's
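The wait above is configurable; in Solr 4.x there is a leaderVoteWait attribute on the <cores> element in solr.xml, in milliseconds. A sketch, with a hypothetical 10-second value:

```xml
<!-- solr.xml sketch; leaderVoteWait bounds how long a shard waits for
     replicas before electing a leader (the value here is hypothetical) -->
<solr persistent="true">
  <cores adminPath="/admin/cores" leaderVoteWait="10000">
    <core name="collection1" instanceDir="collection1"/>
  </cores>
</solr>
```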
Hi Felipe, just a short note to say thanks for your valuable suggestion. I
implemented it and could see the expected results. The length norm still
spoils it for a few fields, but I balanced that with the boost factors
accordingly.
Once again, Many Thanks!
Sandeep
On 1 February 2013 22:53, Sandeep
Well, I have found the following line in
MoreLikeThisHandler$MoreLikeThisHelper.getMoreLikeThis(..):
// exclude current document from results
realMLTQuery.add(
new TermQuery
Hello,
I have an interesting behaviour.
I have a field type text_pl. This type is configured as:
<fieldType name="text_pl" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory"
Apparently this was a side effect of the custom sharding feature.
There is a fix planned, but I don't know more about it than that.
Michael Della Bitta
Appinions
18 East 41st Street, 2nd Floor
New York, NY 10017-6271
www.appinions.com
Where
I know that Solr web-enables a Lucene index, but I'm trying to figure out
what other things Solr offers over Lucene. On the Solr features list it
says "Solr uses the Lucene search library and extends it!", but what exactly
are the extensions from that list, and what does Lucene itself give you? Also, if I
Hi Ralf,
the DisMax query parser does not allow fielded queries, e.g. field:something.
Consider using the eDisMax query parser instead.
Also, debugQuery=on will display informative output showing how the query is
parsed, analyzed, etc.
Ahmet
--- On Tue, 2/12/13, Ralf Heyde ralf.he...@gmx.de wrote:
From: Ralf Heyde
http://lucene.apache.org/solr/
On Tue, Feb 12, 2013 at 10:40 AM, JohnRodey timothydd...@yahoo.com wrote:
I know that Solr web-enables a Lucene index, but I'm trying to figure out
what other things Solr offers over Lucene. On the Solr features list it
says Solr uses the Lucene search library
Hi all,
I'm using Solr 3.3.0 with one master server and two slaves. And the problem
I'm having is that both slaves get degraded randomly but at the same time.
I am completely lost as to what the cause could be, but I see that the
tomcat that runs the Solr webapp executes a PERL script that consumes
Hi,
thanks for your first answer.
I don't want to have a fielded query in my DisMax query.
My DisMax query looks like this:
qt=dismax&q=czółenka... -- works
qt=dismax&q=czolenka... -- does not work
The accessed fields contain the ASCIIFoldingFilter for both query and index.
So, what I need is that
1. Show us the full query request and request handler. In particular, the
qf parameter.
2. Try the Solr Admin Analysis UI to check for sure how the analysis is
being performed.
3. Add debugQuery=true to your query to see how it is actually parsed.
4. If there is any chance that you have
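For the czółenka/czolenka case above, the usual fix is to apply ASCIIFoldingFilterFactory in both the index-time and query-time analyzers and then re-index; a schema sketch (the type name is hypothetical):

```xml
<fieldType name="text_folded" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- folds czółenka to czolenka at index time -->
    <filter class="solr.ASCIIFoldingFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- and the same at query time, so both spellings match -->
    <filter class="solr.ASCIIFoldingFilterFactory"/>
  </analyzer>
</fieldType>
```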
I'll try to re-index -- I modified the schema, but did NOT re-index.
Damn!
-------- Original Message --------
Date: Tue, 12 Feb 2013 11:14:04 -0500
From: Jack Krupansky j...@basetechnology.com
To: solr-user@lucene.apache.org
Subject: Re: DisMax Query Field-Filters (ASCIIFolding)
This is apples and pomegranates. Lucene is a library, Solr is a server. In
features, they are more alike than different.
wunder
On Feb 12, 2013, at 7:40 AM, JohnRodey wrote:
I know that Solr web-enables a Lucene index, but I'm trying to figure out
what other things Solr offers over Lucene.
Here's yet another short list of benefits of Solr over Lucene (not that any
of them take away from Lucene since Solr is based on Lucene):
- Multiple core index - go beyond the limits of a single lucene index
- Support for multi-core or named collections
- richer query parsers (e.g.,
To add to Jack's reply: Solr can also be embedded into an application and run
in the same process. Solr is the "server-ization" of Lucene; the line is very
blurred, and Solr is not a very thin wrapper around the Lucene library.
Many Solr features are distinct from Lucene, like
- detailed breakdown of
On 2/12/2013 12:25 AM, adfel70 wrote:
I'm currently running a solr cluster on 10 physical machines.
I'm considering moving to virtual machines.
Any insights on this issue?
Has anyone tried this? Any best practices?
You'll definitely see some performance degradation. How much is very
hard to
On 2/12/2013 1:42 AM, o.mares wrote:
yesterday we updated from solr 4.0 to solr 4.1 and since then from time to
time following error pops up:
{msg=URLDecoder: Invalid character encoding detected after position 160 of
query string / form data (while parsing as UTF-8),code=400}:
{msg=URLDecoder:
So I have had a fair amount of experience using Solr. However on a separate
project we are considering just using Lucene directly, which I have never
done. I am trying to avoid finding out late that Lucene doesn't offer what
we need and being like aw snap, it doesn't support geospatial (or
: I'm using Solr 3.3.0 with one master server and two slaves. And the problem
: I'm having is that both slaves get degraded randomly but at the same time.
: I am completely lost at to what the cause could be, but I see that the
: tomcat that runs Solr webapp executes a PERL script that consumes
I don't know what the perl script looks like. I can tell it's being run by
tomcat because when I run top, the owner of the process says tomcat and
the CPU is at 100%.
I haven't done anything weird to my Solr installation; actually it is pretty
simple and is the one it used to be on the solr website a
: So it seems that facet.query is using the analyzer of type index.
: Is it a bug or is there another analyzer type for the facet query?
That doesn't really make any sense ...
i don't know much about setting up UIMA (or what/when it logs things) but
facet.query uses the regular query parser
: I don't know how the perl script looks like. I can tell it's being ran by
: tomcat because when I do : top the owner of the process says tomcat and
: the CPU is at 100%.
...
: Do you have any idea of how to see which PERL script is being executed or
: what it's content is?
look at the
: The problem is at program startup -- when 'new HttpSolrServer(url)' is called,
: it goes and makes sure that the server is up and responsive. If any of those
: 56 object creation calls fails, then my app won't even start.
What exactly is the exception you are getting? I don't think anything
On 2/12/2013 11:19 AM, JohnRodey wrote:
So I have had a fair amount of experience using Solr. However on a separate
project we are considering just using Lucene directly, which I have never
done. I am trying to avoid finding out late that Lucene doesn't offer what
we need and being like aw
Michael is correct, that was what was said at the bootcamp (by me). I
believe this may not be correct though.
Further code review shows that Solr 4.0 was already distributing documents
using the hash range technique used in 4.1. The big change in 4.1 was that
a composite hash key could be used to
Is there a page on the wiki that points out the use cases (or the
features) that are best suited for Lucene adoption, and those best
suited for SOLR adoption?
-Glen
On Tue, Feb 12, 2013 at 3:11 PM, Shawn Heisey s...@elyograg.org wrote:
On 2/12/2013 11:19 AM, JohnRodey wrote:
So I have had a
It is like deciding between a disk drive and a file server. Solr and Lucene are
different kinds of things.
wunder
On Feb 12, 2013, at 12:26 PM, Glen Newton wrote:
Is there a page on the wiki that points out the use cases (or the
features) that are best suited for Lucene adoption, and those
And helping people - who don't know much about them - how to decide
which to use is not useful?
-Glen
On Tue, Feb 12, 2013 at 3:34 PM, Walter Underwood wun...@wunderwood.org wrote:
It is like deciding between a disk drive and a file server. Solr and Lucene
are different kinds of things.
On 2/12/2013 12:27 PM, Chris Hostetter wrote:
: The problem is at program startup -- when 'new HttpSolrServer(url)' is called,
: it goes and makes sure that the server is up and responsive. If any of those
: 56 object creation calls fails, then my app won't even start.
What exactly is the
Do you want to embed an index into your application, e.g. as a desktop
app? Use Lucene. Is search basically the whole of your app? Perhaps use
Lucene.
Do you want to offer search as a service? Do you want to be able to
arbitrarily scale your index (beyond the number of documents a single
index
: Currently, edismax applies mm to the combination of all fields listed in qf.
:
: I would like to have mm applied individually to those fields instead.
That doesn't really make sense if you think about how the qf is used to
build the final query structure -- it is essentially producing a
Hi,
I'm using this configuration:
http://wiki.apache.org/solr/DataImportHandlerDeltaQueryViaFullImport
The wiki says: In this case it means obviously that in case you also want
to use deletedPkQuery then when running the delta-import command is still
necessary.
In this link:
Yeah, solr.PointType. Or use solr.SpatialRecursivePrefixTree with
geo=false
http://wiki.apache.org/solr/SolrAdaptersForLuceneSpatial4
On 2/8/13 10:38 AM, Kissue Kissue kissue...@gmail.com wrote:
I can see Solr has the field type solr.LatLonType which supports spatial
based on longitudes and
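A schema sketch for the non-geo case, under the assumption that the full 4.x class name is solr.SpatialRecursivePrefixTreeFieldType and that worldBounds takes "minX minY maxX maxY" (see the SolrAdaptersForLuceneSpatial4 wiki page; the bounds below are hypothetical):

```xml
<fieldType name="location_rpt" class="solr.SpatialRecursivePrefixTreeFieldType"
           geo="false" worldBounds="0 0 1000 1000" distErrPct="0.025"/>
```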
Roman,
Logging clicks and their position in the result list is one useful way to
measure relevance. Using the position you can calculate the mean reciprocal
rank; a value near 1.0 is very good, so over time you can clearly see whether
changes actually improve user
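The mean reciprocal rank mentioned above can be computed straight from the logged click positions (1-based ranks): MRR = (1/N) * sum(1/rank_i). A minimal sketch:

```java
public class Mrr {
    // clickPositions holds the 1-based rank of the clicked result for
    // each logged query; 1.0 means every click was on the top result.
    public static double meanReciprocalRank(int[] clickPositions) {
        if (clickPositions.length == 0) return 0.0;
        double sum = 0.0;
        for (int rank : clickPositions) {
            sum += 1.0 / rank;
        }
        return sum / clickPositions.length;
    }

    public static void main(String[] args) {
        // Three queries with clicks at positions 1, 2 and 4.
        System.out.println(meanReciprocalRank(new int[] {1, 2, 4}));
    }
}
```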
What do you want to achieve with these tests?
Is it meant as a regression, to make sure that only the queries/boosts you
changed are affected?
Then you will have to implement tests that cover your specific
schema/boosts. I'm not aware of any frameworks that do this - we're using
Java based tests
The suggested syntax didn't work with embedded ZooKeeper:
Syntax:
-DzkRun -DzkHost=nodeA:9983,nodeB:9983,nodeC:9983,nodeD:9983/solrroot
-DnumShards=2 -Dbootstrap_confdir=./solr/collection1/conf
-Dcollection.configName=MyConfig
Error:
SEVERE: Could not start Solr. Check solr/home property and the
This config isn't intended for embedded zookeeper, it is for a separate
zookeeper ensemble that is shared with other services.
Upayavira
On Tue, Feb 12, 2013, at 10:19 PM, mbennett wrote:
The suggested syntax didn't work with embedded ZooKeeper:
Syntax:
-DzkRun
First, it may not be a problem assuming your other filter queries are more
frequent.
Second, the easiest way to keep these out of the filter cache would be just
to include them as a MUST clause, like
+(original query) +id:(1 2 3 4).
Third possibility, see
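The second option above, folding the ID list into the main query as a MUST clause so it never lands in the filter cache, can be sketched as (the query, field name, and IDs are hypothetical):

```java
import java.util.Arrays;
import java.util.List;

public class MustClause {
    // Combines the original query and an ID restriction into one main
    // query instead of a separate fq, keeping it out of the filter cache.
    public static String withIds(String originalQuery, List<String> ids) {
        return "+(" + originalQuery + ") +id:(" + String.join(" ", ids) + ")";
    }

    public static void main(String[] args) {
        System.out.println(withIds("title:solr", Arrays.asList("1", "2", "3", "4")));
        // prints: +(title:solr) +id:(1 2 3 4)
    }
}
```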
Well, what does adding debug=query show you for the parsed query? What
documents show up?
My first guess is that since you're using exclusive rather than inclusive
end points, the results don't match your expectations.
Best
Erick
On Mon, Feb 11, 2013 at 10:57 PM, ballusethuraman
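For reference, the inclusive/exclusive distinction in Lucene/Solr range syntax (the field name is hypothetical):

```
price:[10 TO 20]   inclusive: matches 10 and 20
price:{10 TO 20}   exclusive: matches neither 10 nor 20
```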
Hold on here. LBHttpSolrServer should not be used for indexing in a
Master/Slave setup, but in SolrCloud you may use it. Indeed,
CloudSolrServer uses LBHttpSolrServer under the covers.
Now, why would you want to send requests to both servers? If you're in
master/slave mode (i.e. not running
On 2/11/2013 7:47 PM, Mark Miller wrote:
Doesn't sound right to me. I'd guess you heard wrong.
I did a search for id with the quotes throughout the branch_4x source
code. After excluding test code, test files, and other things that
looked like they have good reason to be hardcoded, I was
On 2/12/2013 7:54 PM, Shawn Heisey wrote:
On 2/11/2013 7:47 PM, Mark Miller wrote:
Doesn't sound right to me. I'd guess you heard wrong.
I did a search for id with the quotes throughout the branch_4x source
code. After excluding test code, test files, and other things that
looked like they
Lucene and Solr have an aggressive upgrade schedule. From 3 to 4 there was a
major rewiring, and parts are orders of magnitude faster and smaller.
If you code directly against Lucene, you will never upgrade to newer versions.
(I supported Solr/Lucene customers for 3 years, and nobody ever did.)
Cheers,
Lance
I
Hello,
Daniel, are you looking for the original doc you used for MLT in the
response? You could easily do this on the client side by looking at the IDs
of the returned docs.
Otis
Solr ElasticSearch Support
http://sematext.com/
On Feb 12, 2013 9:26 AM, Daniel Rijkhof
Hi Roman,
We use our own Search Analytics service. It's free and open to anyone - see
http://sematext.com/search-analytics/index.html
And this post talks exactly about the topic you are asking about:
Thank you, Erick! Three great answers!
On Wed, Feb 13, 2013 at 4:20 AM, Erick Erickson erickerick...@gmail.comwrote:
First, it may not be a problem assuming your other filter queries are more
frequent.
Second, the easiest way to keep these out of the filter cache would be just
to include
On 13-Feb-2013, at 8:11 AM, Erick Erickson erickerick...@gmail.com wrote:
Hold on here. LBHttpSolrServer should not be used for indexing in a
Master/Slave setup, but in SolrCloud you may use it. Indeed,
CloudSolrServer uses LBHttpSolrServer under the covers.
In SolrCloud mode,
Hi Roman,
If you're looking for regression testing then
https://github.com/sul-dlss/rspec-solr might be worth looking at. If you're
not a ruby shop, doing something similar in another language shouldn't be too
hard.
The basic idea is that you setup a set of tests like
If the query is X,
Ooh, I didn't know that there is a CloudSolrServer.
Thanks for the pointer.
Will explore that.
./zahoor
On 13-Feb-2013, at 11:49 AM, J Mohamed Zahoor zah...@indix.com wrote:
On 13-Feb-2013, at 8:11 AM, Erick Erickson erickerick...@gmail.com wrote:
Hold on here. LBHttpSolrServer should not