On 9/26/07, Brian Whitman [EMAIL PROTECTED] wrote:
Sami has a patch in there which used an older version of the Solr
client. With the current Solr client in the SVN tree, his patch
becomes much easier.
Your job would be to upgrade the patch and mail it back to him so
he can update his
Hi Guys,
this question has been asked before but I was unable to find an answer
that's good for me, so I hope you guys can help again.
I am working on a website where we need to sort the results by distance
from the location entered by the user. I have indexed the lat and long
info for each
Hello,
For the project I'm working on now it is important to group the results
of a query by a product field. Documents
belong to only one product and there will never be more than 10
different products altogether.
When searching through the archives I identified 3 options:
1)
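Since there will never be more than 10 products, one simple option is to fetch the matching documents and bucket them client-side; a minimal sketch in Python, assuming each result document carries a single `product` value (field and document shapes are illustrative):

```python
from collections import defaultdict

def group_by_product(docs, field="product"):
    """Bucket result documents by their (single-valued) product field."""
    groups = defaultdict(list)
    for doc in docs:
        groups[doc.get(field)].append(doc)
    return dict(groups)

# Example documents, as they might come back from a query
results = [
    {"id": "1", "product": "A"},
    {"id": "2", "product": "B"},
    {"id": "3", "product": "A"},
]
grouped = group_by_product(results)
# grouped maps each product value to the list of its matching docs
```

With at most 10 products, the grouping cost is negligible next to the query itself.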
Arisem is a French ISV delivering best-of-breed text analytics software. We
have been using Lucene in our products since 2001 and are in search of a Lucene
expert to complement our R&D team.
Required skills:
- Master's degree in computer science
- 2+ years of experience working with Lucene
-
On Sep 26, 2007, at 4:04 AM, Doğacan Güney wrote:
NUTCH-442 is one of the issues that I want to really see resolved.
Unfortunately, I haven't received many (as in, none) comments, so I
haven't made further progress on it.
I am probably your target customer but to be honest all we care about
I am new to the list and new to lucene and solr. I am considering Lucene
for a potential new application and need to know how well it scales.
Following are the parameters of the dataset.
Number of records: 7+ million
Database size: 13.3 GB
Index Size: 10.9 GB
My questions are simply:
1)
That seems well within Solr's capabilities, though you should come up
with a desired queries/sec figure.
Solr's query rate varies widely with the configuration -- how many
fields, fuzzy search, highlighting, facets, etc.
Essentially, Solr uses Lucene, a modern search core. It has performance
and
Hi,
I'm new to Solr, sorry if I missed my answer in the docs somewhere...
I need 2 different Solr indexes.
Should I create 2 webapps? In that case I have Tomcat contexts solr and
solr2, but then I can't start solr2; I get this error:
Sep 26, 2007 6:07:25 PM
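One approach from around this era is to deploy the same solr.war under two Tomcat context files, each pointing at its own solr home directory via JNDI; a sketch with illustrative paths (not a drop-in):

```xml
<!-- e.g. $CATALINA_HOME/conf/Catalina/localhost/solr2.xml -->
<Context docBase="/opt/solr/dist/solr.war" debug="0" crossContext="true">
  <!-- each context gets its own solr home, hence its own index and config -->
  <Environment name="solr/home" type="java.lang.String"
               value="/opt/solr2/home" override="true"/>
</Context>
```

A parallel solr.xml context file would point at the first instance's home directory.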
My experiences so far with this level of data have been good.
Number of records: Maxed out at 8.8 million
Database size: friggin huge (100+ GB)
Index size: ~24 GB
1) It took me about a day to index 8 million docs using a non-optimized
program I wrote. It's non-optimized in the sense that it's
I have a large index with a field for a URL. For some reason or
another, sometimes a doc will get indexed with that field blank. This
is fine but I want a query to return only the set URL fields...
If I do a query like:
q=URL:[* TO *]
I get a lot of empty fields back, like:
docstr
By maxed out do you mean that Solr's performance became unacceptable
beyond 8.8M records, or that you only had 8.8M records to index? If
the former, can you share the particular symptoms?
On 9/26/07, Charlie Jackson [EMAIL PROTECTED] wrote:
My experiences so far with this level of data have been
Hi,
I am trying to create my own application using Solr, and while trying to
index my data I get
Server returned HTTP response code: 400 for URL:
http://localhost:8983/solr/update or
Server returned HTTP response code: 500 for URL:
http://localhost:8983/solr/update
Is there a way to get more
Sorry, I meant that it maxed out in the sense that my maxDoc field on
the stats page was 8.8 million, which indicates that the most docs it
has ever had was around 8.8 million. It's down to about 7.8 million
currently. I have seen no signs of a maximum number of docs Solr can
handle.
Thanks all! One last question...
If I had a collection of 2.5 billion docs and a demand averaging 200
queries per second, what's the confidence that Solr/Lucene could handle
this volume and execute search with sub-second response times?
-Original Message-
From: Charlie Jackson
No one can answer that, because it depends on how you configure Solr.
How many fields do you want to search? Are you using fuzzy search?
Facets? Highlighting?
We are searching a much smaller collection, about 250K docs, with
great success. We see 80 queries/sec on each of four servers, and
On 9/26/07, Urvashi Gadi [EMAIL PROTECTED] wrote:
Hi,
I am trying to create my own application using Solr, and while trying to
index my data I get
Server returned HTTP response code: 400 for URL:
http://localhost:8983/solr/update or
Server returned HTTP response code: 500 for URL:
It is a best practice to store the master copy of this data in a
relational database and use Solr/Lucene as a high-speed cache.
MySQL has a geographical database option, so maybe that is a better option
than Lucene indexing.
Lance
(P.S. Please start new threads for new topics.)
-Original
My limited experience with larger indexes is:
1) the logistics of copying around and backing up this much data, and
2) indexing is disk-bound. We're on SAS disks and it makes no difference
between one indexing thread and a dozen (we have small records).
Smaller returns are faster. You need to
On 26-Sep-07, at 10:50 AM, Law, John wrote:
Thanks all! One last question...
If I had a collection of 2.5 billion docs and a demand averaging 200
queries per second, what's the confidence that Solr/Lucene could
handle
this volume and execute search with sub-second response times?
No
On 26-Sep-07, at 5:14 AM, Sandeep Shetty wrote:
Hi Guys,
this question has been asked before but I was unable to find an answer
that's good for me, so I hope you guys can help again.
I am working on a website where we need to sort the results by
distance
from the location entered by the user. I
Could someone tell me what a facet is?
I have a vague idea, but I am not too clear.
A pointer to a sample web site that uses Solr facets
would be very good.
Thanks.
-Kuro
With the new/improved value source functions it should be pretty easy to
develop a new best practice. You should be able to pull in the lat/lon
values from value source fields and then do your great-circle calculation.
- will
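The great-circle calculation itself is just the haversine formula; a minimal sketch in Python (plain math, not Solr API code — the coordinates below are only an example):

```python
import math

def great_circle_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/lon points via the haversine formula."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

# Paris to London, roughly 340-350 km
d = great_circle_km(48.8566, 2.3522, 51.5074, -0.1278)
```

Sorting then amounts to computing this distance per document from its indexed lat/lon values.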
-Original Message-
From: Lance Norskog [mailto:[EMAIL
Dear list,
I have two questions regarding German special characters (umlauts).
Is there an analyzer which automatically converts all German special
characters to their dissected form, such as ü to ue and ä to
ae, etc.?
I would also like the search to always be run against
Faceted search is an approach to search where a taxonomy or categorization
scheme is visible in addition to document matches.
http://www.searchtools.com/info/faceted-metadata.html
--Ezra.
On 9/26/07 3:47 PM, Teruhiko Kurosaka [EMAIL PROTECTED] wrote:
Could someone tell me what a facet is?
I
Try the SnowballPorterFilterFactory described here:
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters
You should use the German2 variant that converts ä and ae to a, ö and oe
to o and so on. More details:
http://snowball.tartarus.org/algorithms/german2/stemmer.html
Every document in
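Wired into schema.xml, the German2 stemmer might look like the following fieldType (the name and surrounding filters are illustrative, not a drop-in):

```xml
<fieldType name="text_de" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- German2 folds ä/ae, ö/oe, ü/ue to the same stem -->
    <filter class="solr.SnowballPorterFilterFactory" language="German2"/>
  </analyzer>
</fieldType>
```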
: Faceted search is an approach to search where a taxonomy or categorization
: scheme is visible in addition to document matches.
My ApacheConUS2006 talk went into a little more detail, including the best
definition of faceted searching/browsing I've ever seen...
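Concretely, faceting in Solr is driven by request parameters; a minimal sketch of building such a query string, assuming a field named `category` (the `facet`, `facet.field`, and `facet.mincount` parameter names are standard Solr ones):

```python
from urllib.parse import urlencode

params = {
    "q": "ipod",
    "facet": "true",            # turn faceting on
    "facet.field": "category",  # return counts per category value
    "facet.mincount": 1,        # skip zero-count buckets
}
query_string = urlencode(params)
# appended to something like http://localhost:8983/solr/select?
```

The response then carries, alongside the matching documents, a count per `category` value that a UI can render as a clickable taxonomy.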
: is there an analyzer which automatically converts all German special
: characters to their dissected form, such as ü to ue and ä to
: ae, etc.?
See also the ISOLatin1AccentFilter, which does this regardless of language.
: I would also like the search to always be run
Have you guys seen Local Lucene?
http://www.nsshutdown.com/projects/lucene/whitepaper/locallucene.htm
No need for MySQL if you don't want to.
rgrds
Ian
Will Johnson wrote:
With the new/improved value source functions it should be pretty easy to
develop a new best practice. You should be
I've experienced a similar problem before. Assuming the field type is
string (i.e. not tokenized), there is a subtle yet important difference
between a field that is null (i.e. not contained in the document) and one
that is an empty string (in the document but with no value). See
I can't download it from http://jetty.mortbay.org/jetty5/plus/index.html
--
regards
jl
Your query will work if you make sure the URL field is omitted from the
document at index time when the field is blank.
adding something like:
<filter class="solr.LengthFilterFactory" min="1" max="1" />
to the schema field should do it without needing to ensure it is not
null or on the
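For context, a fuller sketch of where such a filter could sit in the schema (the field-type name, the KeywordTokenizer choice, and the max value here are illustrative):

```xml
<fieldType name="url_nonempty" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <!-- drops zero-length tokens, so empty URL values are never indexed
         and URL:[* TO *] matches only documents with a real URL -->
    <filter class="solr.LengthFilterFactory" min="1" max="2048"/>
  </analyzer>
</fieldType>
```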
: Date: Thu, 27 Sep 2007 00:12:48 -0400
: From: Ryan McKinley [EMAIL PROTECTED]
: Reply-To: solr-user@lucene.apache.org
: To: solr-user@lucene.apache.org
: Subject: Re: searching for non-empty fields
:
:
: Your query will work if you make sure the URL field is omitted from the
: document at