2010/2/15 Toke Eskildsen t...@statsbiblioteket.dk:
From: Tim Terlegård [tim.terleg...@gmail.com]
If the index size is more than you can have in RAM, do you recommend
to split the index to several servers so it can all be in RAM?
I do expect phrase queries. Total index size is 107 GB. *prx
Hi,
I have to set up a Solr cluster with some availability concept (manual intervention
on failure is acceptable; however, if there is a better way, I'd be
interested in recommendations).
I have two servers (A and B for the example) at my disposal.
What I was thinking about was the
Hi everyone,
in our app we sometimes use solr programmatically to retrieve all the
elements that have a certain value in a single-valued single-token
field ( brand:xxx).
Since we are not interested in scoring these results, I was thinking
that maybe this should be performed as a filter query
Hi Shalin!
Thanks for the quick response. Sadly it tells me that I have to look elsewhere to
fix the problem.
Does anyone have an idea what could cause the increasing warmup times? If required I can
post some stats.
Thanks in advance!
Regards,
Sven
Hi,
I have indexed some data on Solr 1.3.0. Now I want to upgrade to Solr
1.4.0 but keep the same data.
so here are the following steps i performed:
1. extract solr 1.4.0
2. copied the conf and data folder of my index from solr
1.3.0/examples/multicore to solr1.4.0/examples/multicore/
3.
Hi there,
Is there any analysis out there that may help to choose between Tomcat and
Jetty to deploy Solr? I wonder whether there's a significant difference
between them in terms of performance.
Any advice would be much appreciated,
-Steve
On Tue, Feb 16, 2010 at 2:04 PM, NarasimhaRaju rajux...@yahoo.com wrote:
Hi,
using a filter query (fq) is more efficient because SolrIndexSearcher will make
use of the filterCache,
and in your case it returns the entire set from the cache instead of searching
the entire index.
more info about
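To make the idea concrete, here is a simplified sketch (not Solr's actual implementation) of what the filterCache buys you: each filter query string maps to the full set of matching document ids, so repeating the same fq never touches the index again, and no scoring is involved.

```java
import java.util.*;

// Simplified sketch of what Solr's filterCache does conceptually:
// each filter query string maps to the set of matching doc ids,
// so a repeated fq is answered from the cache, not the index.
public class FilterCacheSketch {
    private final Map<String, Set<Integer>> cache = new HashMap<>();
    private int indexHits = 0; // counts how often we had to "search the index"

    // Stand-in for a real index search; the doc ids are made up.
    private Set<Integer> searchIndex(String fq) {
        indexHits++;
        return new HashSet<>(Arrays.asList(1, 5, 9)); // pretend these docs match
    }

    public Set<Integer> getDocSet(String fq) {
        return cache.computeIfAbsent(fq, this::searchIndex);
    }

    public int getIndexHits() { return indexHits; }
}
```

Two identical fq requests would hit the index only once; a filter only selects documents, so skipping scoring is free.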
I doubt a performance benchmark would be very useful; it ultimately
depends on what you are trying to do and what you are comfortable with.
We've had successful deployments on both.
Any difference in performance is far outweighed by ease of setup/support that
you personally find in
How can we get an instance of the IndexSchema object in a Tokenizer subclass?
Hi,
When we index using Solr, we have an option called multivalued. How does
that work with multiple files associated with the same document?
For example: submitting a form with some fields + a list of PDF files
index process:
1) considering all the form fields as individual solr input document fields
Hello ,
Thanks. That clears my doubts. Coming to point two, can
you please tell me which part of the Similarity takes care of
this? Is it possible to implement it in such a way that we give more
preference to the number of found terms? Also, here in our case we need
to give more
Hi,
Can anybody tell me if [1] still applies as of version trunk 03/02/2010 ? I
am removing documents from my index using deletedPkQuery and a deltaimport.
I can tell from the logs that the removal seems to be working:
16-Feb-2010 15:32:54 org.apache.solr.handler.dataimport.DocBuilder
Unless you have *evidence* that indexing each PDF with
the form data as a single Solr document is a problem,
I would just index the fields with each document rather
than try to index the PDFs as multivalued. The space
used by duplicating the form field data is probably a
tiny fraction of the
Thanks.
Is it possible to do date faceting on multiple solr shards?
I am using indexes created in two different shards to do date faceting on
the field DATE.
On a related note: maybe it'd be good to have a wiki page of
experiences and possibly stats of various SSD drives? Either on the
Lucene or Solr wiki sites?
2010/2/16 Tim Terlegård tim.terleg...@gmail.com:
2010/2/15 Toke Eskildsen t...@statsbiblioteket.dk:
From: Tim Terlegård
Hi all,
Trying to debug a very sneaky bug in a small Solr extension that I
wrote, and I've come across an odd situation. Here's what my test
suite does:
deleteByQuery(*:*);
// add some documents
commit();
// test the search
This works fine. The test suite that exposed the error (which is
Mat Brown wrote:
Hi all,
Trying to debug a very sneaky bug in a small Solr extension that I
wrote, and I've come across an odd situation. Here's what my test
suite does:
deleteByQuery(*:*);
// add some documents
commit();
// test the search
This works fine. The test suite that exposed
Hello,
Thanks. That clears my doubts. Coming to the point two, Can
you please tell me which part of the Similarity takes care of the
same. Is it possible to implement in such a way that we give more
preference to number of found terms.
public float coord(int overlap, int
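The coord factor quoted above is the part of Similarity that rewards the number of matched query terms; in Lucene's DefaultSimilarity it is linear in the fraction of query terms found in the document. A sketch of that default:

```java
// Sketch of Lucene's DefaultSimilarity.coord: the score factor grows
// linearly with the fraction of query terms that matched the document.
// Overriding coord in a custom Similarity (e.g. with a steeper curve)
// is one way to give more preference to the number of found terms.
public class CoordSketch {
    public static float coord(int overlap, int maxOverlap) {
        return overlap / (float) maxOverlap;
    }
}
```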
Cool, thanks - just wanted to make sure I'm not insane. Makes sense
that there would be a difference if the index is built fresh in that
case.
On Tue, Feb 16, 2010 at 11:59, Mark Miller markrmil...@gmail.com wrote:
Mat Brown wrote:
Hi all,
Trying to debug a very sneaky bug in a small Solr
Hi all,
I am getting the same recursively concatenated results as the guys in the
comments (http://issues.apache.org/jira/browse/SOLR-64). I couldn't get
hierarchical facets working with either release-1.4.0 or branch-1.4.0. I've got
a 1.4.0-dev incl. SOLR-64 running and in parallel a 1.4.0-final. I
It definitely had something to do with omitTermFreqAndPositions. As soon as I
disabled the option and re-indexed, my queries started working as expected. I
suspect it has something to do with terms occupying the same position and
losing that information by using omitTermFreqAndPositions, but
I've got a task open to upgrade to 0.6. Will try to get to it this week.
Upgrading is usually pretty trivial.
On Feb 14, 2010, at 12:37 AM, Liam O'Boyle wrote:
Afternoon,
I've got a large collection of documents which I'm attempting to add to
a Solr index using Tika via the
any hints?
--
View this message in context:
http://old.nabble.com/How-to-retrieve-relevance-%22debug-explain%22-info-in-code--tp27602530p27612814.html
Sent from the Solr - User mailing list archive at Nabble.com.
Hi,
any hints or suggestions?
Does anyone do the updating this way?
Regards,
Peter.
Hi solr community!
Is it recommended to replace the data directory of a heavily used Solr
instance?
(I am aware of the HTTP queries, but that would be too slow.)
I need a fast way to push development data to
Any details? This is pretty ambiguous.
Tacking debugQuery=true onto a URL brings back some stuff.
In Lucene, IndexSearcher.explain()?
Erick
On Tue, Feb 16, 2010 at 1:21 PM, uwdanny uwda...@gmail.com wrote:
any hints?
Hi All
Is there any tool to analyze corrupted data in Solr? I am aware of Luke,
but does it somehow show that the data is corrupted?
Like some segments missing, or whether some documents have been corrupted,
i.e. not fully indexed?
Thanks
Dipti
Hi Erick, thanks for the reply.
My query URL includes debugQuery=on and the result page is correctly
showing all the debug/explain info. The problem I'm facing is that I
cannot get the same debug/explain info in code. I've been trying the
IndexSearcher.explain(Weight, int) API, as well as
I've set up a simple DIH import handler with Solr that connects to my data
via a database.
I have a small worry though. When I call the full-import functions, can I
configure Solr (via the XML files) to make sure there are rows to index before
wiping everything? What worries me is if, for some
Hello,
I'm interested in using Solr with a custom Lucene Filter (like the one
described in section 6.4.1 of the Lucene In Action, Second Edition book). I'd
like to filter search results from a Lucene index against information stored in
a relational database. I don't want to move the
Hello ,
Thanks for your detailed explanation.
Do you want to punish long documents *more*?
Not a lot, but a bit more than the default implementation. It seems
lengthNorm is field based, and punishing lengthy fields fits most
of the cases in our project.
There will be a trade-off
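For context, DefaultSimilarity's lengthNorm is 1/sqrt(numTokens); a custom Similarity could use a slightly steeper curve to punish long fields a bit more. A sketch, where the 0.75 exponent is an arbitrary illustration rather than a tuned or recommended value:

```java
// Sketch: DefaultSimilarity.lengthNorm is 1/sqrt(numTokens).
// A hypothetical custom Similarity could override lengthNorm with a
// slightly larger exponent to punish long fields a bit more than default.
public class LengthNormSketch {
    // what DefaultSimilarity computes per field
    public static float defaultLengthNorm(int numTokens) {
        return (float) (1.0 / Math.sqrt(numTokens));
    }

    // illustrative alternative; the 0.75 exponent is an assumption
    public static float harsherLengthNorm(int numTokens) {
        return (float) (1.0 / Math.pow(numTokens, 0.75));
    }
}
```

Keep in mind norms are stored in a single byte per field, so small differences in the curve can be lost to quantization.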
Hi everyone,
I am attempting to implement a faceted drill down feature with Solr. I am
having problems explaining some results of the fq parameter.
Let's say I have two fields, 'people' and 'category'. I do a search for 'dog'
and ask to facet on the people and category fields.
I am told that
Hi,
I've read a very interesting interview with Ryan:
http://www.lucidimagination.com/Community/Hear-from-the-Experts/Podcasts-and-Videos/Interview-Ryan-McKinley
Another finding is
https://issues.apache.org/jira/browse/SOLR-773
(lucene/contrib/spatial)
Is there any more stuff going on for Solr
Hi
Oops, sorry. I didn't notice the answer because it was in the bulk
folder.
I thought this procedure would be a lot faster with less overhead.
Just two lines of shell script.
What do you think?
Regards,
Peter.
This should work on Linux. The rsync based replication scripts used
Problem solved. I wasn't quoting the value. Since I was using names such as
'Gary Bettman', Solr must have been matching all the Garys.
-Original Message-
From: Nagelberg, Kallin [mailto:knagelb...@globeandmail.com]
Sent: Tuesday, February 16, 2010 3:22 PM
To: 'solr-user@lucene.apache.org'
Thanks Hoss
Apologies for flooding the list.
But I still can't stop thinking about this.
I deleted my entire index and now I have 0 documents.
Now if I make a query with accrd I still get a suggestion of accord even
though there are no documents returned, since I deleted my entire index. I
hope it
Hi Jon,
You will need to write a plugin: a custom query parser and an Update
Handler, depending on what you are doing.
Implementing an Update Handler or Update Request Processor is not
generally recommended because it is considered to be advanced.
Take a look at the following links
Hi all!
I'm trying to join 2 indexes together to produce a final result using only Solr
+ Velocity Response Writer.
The problem is that each hit of the main index contains references to some
common documents located in another index. For example, the hit could have a
field that describes in
Hi Israel (et al),
I don't think that I need an Update Handler; I don't intend to change the
values in the search index (in fact, the goal is to build a Lucene index with
Hadoop and then point a Solr instance at it).
What I'm trying to do is split the document into two locations: one is the
After becoming aware of all these combinations, it seems unwise to
proceed blindly by punishing whatever we want.
Thank you very much for letting me know.
Generally most people are happy with the default Solr scoring, especially in
web-like search.
I am not sure, but you can find this
Update - found the answer: the getExplainList API in
org.apache.solr.util.SolrPluginUtils works.
uwdanny wrote:
Hi,
I was trying to get the detailed explain info in (Java) code using the
APIs; see code below,
-
ResponseBuilder rb (from some inherited process
Hi,
It seems that when I do a search with a wildcard (eg, +text:abc*) the Solr
standard SearchHandler will construct a ConstantScoreQuery passing in a
Filter, so all the documents in the result set are scored the same. Is there
a way to make Solr construct a BooleanQuery instead so that scoring
It seems that when I do a search with a wildcard (eg, +text:abc*) the Solr
standard SearchHandler will construct a ConstantScoreQuery passing in a
Filter, so all the documents in the result set are scored the same. Is there
a way to make Solr construct a BooleanQuery instead so that
: According to this email exchange between Koji and Mat Brown,
:
: http://www.mail-archive.com/solr-user@lucene.apache.org/msg23759.html
:
: The boost value from copyField's shouldn't be accumulated into the boost for
: the text field, can anyone else verify this? This seem to go against what
: I need to do a search that will search 3 different fields and combine
: the results. First, it needs to not break the phrase into tokens, but
: rather treat it is a phrase for one field. The other fields need to be
: parsed with their normal analyzers.
your description of your goal is a
Greetings,
It's time for another awesome Seattle Hadoop/Lucene/Scalability/NoSQL Meetup!
As always, it's at the University of Washington, Allen Computer
Science building, Room 303 at 6:45pm. You can find a map here:
http://www.washington.edu/home/maps/southcentral.html?cse
Last month, we had a
: I want to know How can I set request timeout through perl by
: webservice::solr end or solr end so that I could hanlde request timeout
I've never used WebService::Solr, but its docs say it takes in a user
agent object (ie: LWP::UserAgent), so that's where you can specify the
client side
:i have indexed some data on solr 1.3.0. Now i wanna upgrade to solr
: 1.4.0 but on the same data.
: so here are the following steps i performed:
: 1. extract solr 1.4.0
: 2. copied the conf and data folder of my index from solr
: 1.3.0/examples/multicore to solr1.4.0/examples/multicore/
:
Jan Hoydal / Otis,
First off, thanks for mentioning us. We do use some utility functions from
Solr, but our index engine is built on top of Lucene only; there are no Solr
cores involved. We do have a JOIN operator that allows us to perform
relational searches while still acting like a search
Chris Hostetter wrote:
: According to this email exchange between Koji and Mat Brown,
:
: http://www.mail-archive.com/solr-user@lucene.apache.org/msg23759.html
:
: The boost value from copyField's shouldn't be accumulated into the boost for
: the text field, can anyone else verify this? This
: I have a small worry though. When I call the full-import functions, can
: I configure Solr (via the XML files) to make sure there are rows to
: index before wiping everything? What worries me is if, for some unknown
: reason, we have an empty database, then the full-import will just wipe
:
: I'm interested in using Solr with a custom Lucene Filter (like the one
: described in section 6.4.1 of the Lucene In Action, Second Edition
: book). I'd like to filter search results from a Lucene index against
: information stored in a relational database. I don't want to move the
:
: But still i cant stop thinking about this.
: i deleted my entire index and now i have 0 documents.
:
: Now if i make a query with accrd i still get a suggestion of accord even
: though there are no document returned since i deleted my entire index. i
: hope it also clear the spell check index
: no but you can set a default for the qf parameter with the same value
good call...
https://issues.apache.org/jira/browse/SOLR-1776
-Hoss
Thanks for bringing closure.
Erick
On Tue, Feb 16, 2010 at 7:13 PM, uwdanny uwda...@gmail.com wrote:
update - found the answer
API getExplainList in org.apache.solr.util.SolrPluginUtils
works.
uwdanny wrote:
Hi,
I was trying to get the detailed explain info in (java) code
It's generally a bad idea to try to think of
various Solr/Lucene indexes in a database-like
way; Lucene isn't built to do RDBMS-like stuff. The
first suggestion is usually to consider flattening
your data. That would be something like
adding NY and New York in each document.
If that's not
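A minimal sketch of that flattening idea, with hypothetical field and document names: instead of joining against a lookup table at query time, index every variant into one multivalued field so a plain term query matches either form.

```java
import java.util.*;

// Sketch of "flattening" relational data into each document:
// rather than joining against a lookup table (RDBMS-style), every
// variant is indexed in a multivalued field. Names are hypothetical.
public class FlattenSketch {
    public static Map<String, List<String>> buildDoc() {
        Map<String, List<String>> doc = new HashMap<>();
        doc.put("id", List.of("doc1"));
        // both the abbreviation and the full name go into the same field
        doc.put("state", List.of("NY", "New York"));
        return doc;
    }

    // a plain term query on the field now matches either variant
    public static boolean matches(Map<String, List<String>> doc, String field, String term) {
        return doc.getOrDefault(field, List.of()).contains(term);
    }
}
```

The cost is some duplicated data per document; the benefit is that queries stay simple single-index term lookups.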
A problem is that your profanity list will not stop growing, and with
each new word you will want to rescrub the index.
We had a thousand-word NOT clause in every query (a filter query would
be true for 99% of the index) until we switched to another
arrangement.
Another small problem was that I
The data copied from title to content is exactly the strings that you
give. The data is copied around, then each field is analyzed. Changing
'title' from text to string makes no difference.
On Mon, Feb 15, 2010 at 6:48 AM, adeelmahmood adeelmahm...@gmail.com wrote:
I am just trying to
When you change an index you do not have to copy the entire index
again. The new part of the index is in separate files and the
replication code knows to only pull the differences.
Indexing on a master and copying to slaves works very well - there are
thousands of Solr installations using that
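The incremental copy described above can be sketched as a set difference over index file names, since Lucene writes new segments as new files (the file names below are made up for illustration):

```java
import java.util.*;

// Sketch of incremental index replication: Lucene writes new segments
// as new files, so a slave only needs the files it does not already
// have. The segment file names in the test are made up.
public class ReplicationSketch {
    public static Set<String> filesToPull(Set<String> masterFiles, Set<String> slaveFiles) {
        Set<String> diff = new TreeSet<>(masterFiles);
        diff.removeAll(slaveFiles); // files the slave already has stay put
        return diff;
    }
}
```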
This is the CheckIndex program in Lucene. I don't have a link handy
for running it, but it is in the lucene-core jar file in solr/lib.
On Tue, Feb 16, 2010 at 11:08 AM, dipti khullar dipti.khul...@gmail.com wrote:
Hi All
Is there any tool to analyze corrupted data in Solr. I am aware of luke.
Norms are generally not calculated. You need to change the field you
want with this attribute: omitNorms=false.
On Tue, Feb 16, 2010 at 2:38 PM, Ahmet Arslan iori...@yahoo.com wrote:
After getting aware of all
these combinations, it seems not
wise to proceed blindly by punushing what ever we
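The omitNorms advice above maps to a one-line change in schema.xml; the field name and type here are placeholders for whatever field you want norms on:

```xml
<!-- omitNorms="false" keeps length norms for this field so lengthNorm
     can take effect; name and type are placeholders -->
<field name="content" type="text" indexed="true" stored="true" omitNorms="false"/>
```

A re-index is required after changing this attribute, since norms are written at index time.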
These are some very large numbers. 700k ms is 700 seconds (almost 12
minutes), 4M ms is 4,000 seconds or about 66 minutes. No Solr installation
should take this long to warm up.
There is something very wrong here. Have you optimized lately? What
queries do you run to warm it up? And, the basics: how many documents,
how much
On Wed, Feb 17, 2010 at 8:03 AM, Chris Hostetter
hossman_luc...@fucit.org wrote:
: I have a small worry though. When I call the full-import functions, can
: I configure Solr (via the XML files) to make sure there are rows to
: index before wiping everything? What worries me is if, for some
I've actually run into this issue; huge, 30 minute warm up times. I've
found that reducing the auto-warm count on caches (and the general size
of the cache) helped a -lot-, as did making sure my warm up query wasn't
something like:
q=*:*&facet=true&facet.field=somethingWithAWholeLotOfTerms
Hi,
Solr home: 1.3.0/examples/multicore
Type of Queries: Recursive e.g. I search in the index for some name that
returns some rows. For each row there is a field called parentid which is a
unique key for some other row in the index. The next queries search the
index for the parentid . This
: I believe Koji was mistaken. Looking at DocumentBuilder.toDocument, the
: boosts have been propagated to copyField destinations since that method was
: added in 2007 (initially it didn't deal with copyFields at all, but once
: that was fixed it copied the boosts as well.)
...
: Hmm,
Thanks Ron. Actually, I'm developing a web search engine. Would that
matter?
Thanks.
2010/2/16 Ron Chan rc...@i-tao.com
I'd doubt if a performance benchmark would be very useful, it ultimately
depends on what you are trying to do and what you are comfortable with.
We've had successful