Hi,
1) How does commit work with multiple requests?
2) Does Solr handle concurrency during updates?
3) Does Solr support anything like: if I enclose the keywords within quotes,
then we are searching for exactly those keywords together, something like
Google does? For example, if I enclose
Hi,
I'm in the process of evaluating solr and sphinx, and have come to
realize that actually having a large data set to run them against
would be handy. However, I'm pretty new to both systems, so thought
that perhaps asking around may produce something useful.
What *I* mean by largish is
Hello everyone.
I've been reading some posts on this forum and I thought it best to start my
own post, as our situation is different from everyone else's. Isn't it always
:-)
We've got a Django-powered website that has Solr as its search engine.
We're using the example solr application and
vanderkerkoff wrote:
I found another post that suggested editing the unlockOnStartup value in
solrconfig.xml.
Is that a wise idea?
If you only have a single Solr instance at a time, it should be totally
fine.
yeah that is possible, I just tried on one of my Solr instances... let's say
you have an index of player names:
(first-name:Tim AND last-name:Anderson) OR (first-name:Anwar AND
last-name:Johnson) OR (conference:Mountain West)
will give you the results that logically match this query..
HTH.
You might be interested in the Lucene Java contrib/Benchmark task,
which provides an indexing implementation of a download of Wikipedia
(available at http://people.apache.org/~gsingers/wikipedia/)
It is pretty trivial to convert the indexing code to send add
commands to Solr.
HTH,
Grant
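For anyone trying this: the add commands Grant mentions are plain XML messages posted to Solr's /update handler. A minimal sketch of the two messages involved (URL and field names here are illustrative, not from the Wikipedia dump):

```xml
<!-- POST this to http://localhost:8983/solr/update (the example app's default URL) -->
<add>
  <doc>
    <field name="id">wiki-12</field>
    <field name="title">Anarchism</field>
    <field name="text">Article body goes here...</field>
  </doc>
</add>

<!-- Then issue a commit to make the documents searchable: -->
<commit/>
```

Batching many `<doc>` elements into one `<add>` and committing only at the end is noticeably faster than one commit per document.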
Hi Yonik.
Do you have any performance statistics about those changes?
Is it possible to upgrade to this new Lucene version using the Solr 1.2
stable version?
Regards,
Daniel
On 17/9/07 17:37, Yonik Seeley [EMAIL PROTECTED] wrote:
If you want to see what performance will be like on the next
Hello! Were you able to find out anything? I'd be interested to know
what you found out.
++
| Matthew Runo
| Zappos Development
| [EMAIL PROTECTED]
| 702-943-7833
++
On Sep 15,
If you want to see what performance will be like on the next release,
you could try upgrading Solr's internal version of lucene to trunk
(current dev version)... there have been some fantastic improvements
in indexing speed.
For query speed/throughput, Solr 1.2 or trunk should do fine.
-Yonik
On 17 Sep 2007, at 12.06, David Welton wrote:
I'm in the process of evaluating solr and sphinx, and have come to
realize that actually having a large data set to run them against
would be handy. However, I'm pretty new to both systems, so thought
that perhaps asking around may produce something
Jack, the JNDI-enabling jarfiles now ship as part of the main .zip
distribution. There is no need for a separate JettyPlus download as
of Jetty 6.
I used Jetty 6.1.3 (http://dist.codehaus.org/jetty/jetty-6.1.x/
jetty-6.1.3.zip) at the time, and I am using only these jarfiles from
the main
On 16-Sep-07, at 8:01 PM, erolagnab wrote:
Hi,
Just a FYI.
I've seen some posts mention that Solr can index 100-150 docs/s, and the
comparison between embedded Solr and HTTP. I've tried to do the indexing
with 1.7+ million docs, each doc has 30 fields among which 10 fields are
There is no way to trigger snapshot taking through Solr's admin interface
now. Taking a snapshot is a very light-weight operation. It uses hard
links, so each snapshot doesn't take up much additional disk space. If you
[Wu, Daniel]
It is not a concern on the snapshot performance. Rather,
Hi,
I have a collection of blogs. Each Solr document has one blog with 3 fields
- blogger(id), title and blog text.
The search is performed over all 3 fields. When doing the search I need to
show 2 things:
1. Bloggers block with all the matching bloggers (so if a title, blog or
blogger contains
: I was also suggesting a new feature to allow sending messages to Solr
: through http interface and a mechanism to handling the message on the
: Solr server; in this case, a message to trigger snapshooter script. It
: seems to me, a very useful feature to help simplify operational issues.
it's
: 1. Bloggers block with all the matching bloggers (so if a title, blog or
: blogger contains the search term, I show the blogger's id)
: The first block is my problem since it shows multiple instances of the same
: blogger if that blogger has multiple matching blogs. I can use faceting to
:
: My document will have a multivalued compound field like
:
: revision_01012007
: review_02012007
:
: i am thinking of a query like comp:type:review date:[02012007 TO
: 02282007]~0
your best bet is to change that so revision and review are the names
of fields, and do a range search on them
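A hedged sketch of that layout (the field and type names are made up for illustration): instead of packing the type into the value, give each type its own date field in schema.xml and range-query it directly.

```xml
<!-- Hypothetical schema.xml entries -->
<field name="revision_date" type="date" indexed="true" stored="true"/>
<field name="review_date"   type="date" indexed="true" stored="true"/>

<!-- Reviews in February 2007 (Solr dates are ISO 8601):
     review_date:[2007-02-01T00:00:00Z TO 2007-02-28T23:59:59Z] -->
```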
: How can I boost words where the whole value (not just the token) is closer to
: the front of the value? That is, I want 'ca' to return:
: 1. Canon PowerShot
: 2. Canon EX PIXMA
: 3. iPod Cable
: 4. Video Card
: (actually 1 and 2 could be swapped)
i would argue that you don't want #3 and #4 at
: Should the EdgeNGramFilter use the same term position for the ngrams within a
: single token?
i can see the argument going both ways ... imagine a hypothetical
CharSplitterTokenFilter that replaces each token in the stream with
one token per character in the original token (ie: hello
On 9/16/07, Ryan McKinley [EMAIL PROTECTED] wrote:
Should the EdgeNGramFilter use the same term position for the ngrams
within a single token?
It feels like that is the right approach.
I don't see value in having them sequential, and I can think of uses
for having them overlap.
-Yonik
: nope, the field options are created on startup -- you can't change them
: dynamically (i don't know all the details, but I think it is a file format
: issue, not just a configuration issue)
In the underlying Lucene library most of these options can be controlled
per document, but Solr
: The corresponding entry for this field in schema.xml is :
: <field name="id" type="text" indexed="true"
: stored="true" multiValued="false" required="true"/>
i'm guessing text is from the example schema.xml ... this is not a good
type to use for a uniqueKey field ... that alone might
ulimit is unlimited and cat /proc/sys/fs/file-max 11769
I just went through the same kind of mistake - ulimit doesn't report
what you think it does; what you should check is ulimit -n (the -n
isn't just the option for setting the value). If you're using bash as your
shell, that will almost
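To make the distinction concrete, a quick bash illustration (Linux; default limits vary by distribution):

```shell
# With no flag, ulimit reports the max file *size* (usually "unlimited"),
# which says nothing about file descriptors.
ulimit

# The per-process open file descriptor limit is the one Lucene/Solr can hit:
ulimit -n

# The system-wide ceiling on open files:
cat /proc/sys/fs/file-max
```

Raising the limit with `ulimit -n <value>` only affects the current shell and its children, so it has to happen in whatever script launches Solr.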
I've been looking at http://wiki.apache.org/solr/UserTagDesign on
and off for a while and think all the use cases could be explained
with simple UML class diagram semantics:
[Taggable](tag:Tag)-- {0..*} |--- {0..*} --(tag:Tag)[Tagger]
|
if you really want #3 and #4 to show up, then have two fields: one using
whitespace tokenizer, one using keyword tokenizer; both using
EdgeNGramFilter ... boost the query to the first field higher than the
second field (or just rely on the coordFactor and the fact that ca will
match on both
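A sketch of that two-field setup in schema.xml (type names, field names, and gram sizes are illustrative, and assume a Solr build that includes EdgeNGramFilterFactory):

```xml
<!-- Per-word prefixes: "ca" matches "Canon", "Cable", "Card" -->
<fieldType name="edgy_words" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="25"/>
  </analyzer>
</fieldType>

<!-- Whole-value prefixes: "ca" only matches values *starting* with "ca" -->
<fieldType name="edgy_full" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="25"/>
  </analyzer>
</fieldType>

<field name="name_words" type="edgy_words" indexed="true" stored="false"/>
<field name="name_full"  type="edgy_full"  indexed="true" stored="false"/>
```

A query like `name_full:ca^5 OR name_words:ca` would then rank whole-value prefix matches ("Canon PowerShot") above mid-value token matches ("Video Card").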
-----Original Message-----
From: Chris Hostetter [mailto:[EMAIL PROTECTED]
Sent: Monday, September 17, 2007 1:28 PM
To: solr-user@lucene.apache.org
Subject: RE: Triggering snapshooter through web admin interface
: I was also suggesting a new feature to allow sending messages to
Solr
:
C'est Parfait! .. yes - that was the problem.
thanks a lot.
I am compiling a complete list of FAQs - will update it in the wiki soon.
-vEnKAt
On 9/18/07, Chris Hostetter [EMAIL PROTECTED] wrote:
: The corresponding entry for this field in schema.xml is :
: <field name="id"