Hi,
I've been running Solr 1.3 on an EC2 instance for a couple of weeks and I've
had some stability issues. It seems like I need to bounce the app once a day.
That I could live with and ultimately maybe troubleshoot, but what's more
disturbing is that three times in the last 2 weeks my index
Hi Lars,
Thanks, it works great.
BR
Christophe
Lars Kotthoff wrote:
I'm doing the following query:
q=text:abc AND type:typeA
And I ask it to return highlighting (query.setHighlight(true);). The search
term for the type field (typeA) is also highlighted in the text field.
Any way to avoid
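For reference, a minimal sketch of restricting highlighting to one field with the hl.fl parameter, so a match on type is no longer highlighted (host, port, and field names here are placeholders, not from the original message):

```python
from urllib.parse import urlencode

# Build a Solr select URL; hl.fl limits highlight snippet generation
# to the fields listed, so only "text" matches get highlighted.
params = {
    "q": "text:abc AND type:typeA",
    "hl": "true",
    "hl.fl": "text",  # only produce highlight snippets for this field
}
url = "http://localhost:8983/solr/select?" + urlencode(params)
print(url)
```

In SolrJ the equivalent should be query.addHighlightField("text"), or setting hl.fl directly via setParam.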
Hello,
I'm using Solr 1.3. I would like to get only the minimum and maximum values
from a facet.
In fact I'm using a range to get the results: [value TO value], and I don't
need to get the facet list in my XML results (which could be more than a
hundred thousand values)... so, I have to display the range
Hello,
I have been validating Solr 1.3 for about 3 weeks now... My goal is to migrate
from Lucene to Solr because of the much better plugins and search functions.
Right now I am stress testing the performance, sending 2,500 search
requests via the JSON protocol from my PHPUnit test case.
All search
On Thu, Oct 30, 2008 at 4:12 PM, Kraus, Ralf | pixelhouse GmbH
[EMAIL PROTECTED] wrote:
I have been validating Solr 1.3 for about 3 weeks now... My goal is to migrate
from Lucene to Solr because of the much better plugins and search
functions.
Very nice!
Right now I am stress testing the
is there a limit on the number of facets that i can create in
Solr?(dynamically generated facets.)
--
Jeryl Cook
/^\ Pharaoh /^\
http://pharaohofkush.blogspot.com/
Whether we bring our enemies to justice, or bring justice to our
enemies, justice will be done.
--George W. Bush, Address to a Joint
Mark Miller wrote:
Right now I am stress testing the performance, sending 2,500 search
requests via the JSON protocol from my PHPUnit test case.
All search requests are different, so caching doesn't do it for me.
Right now our old Lucene JSPs are about 4 times faster than my Solr
solution :-(
Kraus, Ralf | pixelhouse GmbH wrote:
Mark Miller wrote:
Right now I am stress testing the performance, sending 2,500 search
requests via the JSON protocol from my PHPUnit test case.
All search requests are different, so caching doesn't do it for me.
Right now our old Lucene JSPs are about 4
Shalin Shekhar Mangar wrote:
On Thu, Oct 30, 2008 at 4:12 PM, Kraus, Ralf | pixelhouse GmbH
[EMAIL PROTECTED] wrote:
I have been validating Solr 1.3 for about 3 weeks now... My goal is to migrate
from Lucene to Solr because of the much better plugins and search
functions.
Very nice!
Hi,
I am trying to use Solrj for my web application. I am indexing a table
using the @Field annotation tag. Now I need to index or query multiple
tables. Like, get all the employees who are managers in Finance
department (interacting with 3 entities). How do I do that?
Does anyone have any
On Thu, Oct 30, 2008 at 5:22 PM, Kraus, Ralf | pixelhouse GmbH
[EMAIL PROTECTED] wrote:
Right now I am using these PHP classes to send and receive my requests:
- Apache_Solr_Service.php
- Response.php
They have the advantage that I don't need to write extra JSP or Java code...
On Thu, Oct 30, 2008 at 1:02 AM, Barnett, Jeffrey
[EMAIL PROTECTED] wrote:
I thought it was turned off already. (Lucene vs Solr?) Where do I make
this change?
Comment out this part in your solrconfig.xml:
<autoCommit>
  <maxDocs>2</maxDocs>
  <maxTime>4</maxTime>
</autoCommit>
-Yonik
Mark Miller wrote:
Kraus, Ralf | pixelhouse GmbH wrote:
Mark Miller wrote:
Right now I am stress testing the performance, sending 2,500 search
requests via the JSON protocol from my PHPUnit test case.
All search requests are different, so caching doesn't do it for me.
Right now our old
Hi,
There are two sides to this.
1. Indexing (getting data into Solr): SolrJ or DataImportHandler can be
used for this.
2. Querying (getting data out of Solr): here you do not have the option
of joining multiple tables. There is only one index in Solr.
On Thu, Oct 30, 2008 at 5:34 PM, Raghunandan
On Thu, Oct 30, 2008 at 8:39 AM, Kraus, Ralf | pixelhouse GmbH
[EMAIL PROTECTED] wrote:
Okay okay :-) I am writing a new JSP handler for my requests as we speak :-)
I really hope performance will be better than with {wt=javabin}
What are your requirements for requests/sec, and how many are
Thanks Noble.
So you mean to say that I need to create a view according to my query and then
index on the view and fetch?
-Original Message-
From: Noble Paul നോബിള് नोब्ळ् [mailto:[EMAIL PROTECTED]
Sent: Thursday, October 30, 2008 6:16 PM
To: solr-user@lucene.apache.org
Subject: Re:
All search requests are different, so caching doesn't do it for me.
P.S. If caching is not helping you, turn it off. It costs to populate /
maintain the cache, so if it's not helping, it's only hurting.
Have you gone through http://wiki.apache.org/solr/SolrPerformanceFactors ?
Can you explain a little more about your testcase, maybe even share
code? I only know a little PHP, but maybe someone else who is better
versed might spot something.
On Oct 30, 2008, at 8:39 AM, Kraus, Ralf |
Not really. If you explain your use case it will be clearer.
On Thu, Oct 30, 2008 at 6:20 PM, Raghunandan Rao
[EMAIL PROTECTED] wrote:
Thanks Noble.
So you mean to say that I need to create a view according to my query and
then index on the view and fetch?
-Original Message-
For those attending ApacheCon in New Orleans next week, the Lucene
Search and Machine Learning Birds of a Feather (BOF) will be held
Wednesday night. Please indicate your interest at: http://wiki.apache.org/apachecon/BirdsOfaFeatherUs08
Also, note there are a number of Lucene/Solr/Mahout
On Thu, Oct 30, 2008 at 7:28 AM, Jeryl Cook [EMAIL PROTECTED] wrote:
is there a limit on the number of facets that i can create in
Solr?(dynamically generated facets.)
Not really; it's practically limited by CPU and memory, which can vary
widely with what the facet fields look like (number of
Grant Ingersoll wrote:
Have you gone through
http://wiki.apache.org/solr/SolrPerformanceFactors ?
Can you explain a little more about your testcase, maybe even share
code? I only know a little PHP, but maybe someone else who is better
versed might spot something.
I just wrote my JSP
Generally, you need to get your head out of the database world and into
the search world to be successful with Lucene. For instance, one
of the cardinal tenets of database design is to normalize your
data. It goes against every instinct to *denormalize* your data when
creating a Lucene index
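To make the denormalization point concrete, here is a hypothetical sketch (table and field names invented for illustration) of flattening an employee row and its department row into one Solr-style document:

```python
# Hypothetical normalized rows, as they might come from two tables.
employee = {"id": 7, "name": "Alice", "dept_id": 3, "is_manager": True}
department = {"dept_id": 3, "dept_name": "Finance"}

# Denormalize: copy the department's fields onto the employee document,
# so a single query like q=dept_name:Finance AND is_manager:true
# needs no join at search time.
doc = dict(employee)
doc.pop("dept_id")                          # the join key is no longer needed
doc["dept_name"] = department["dept_name"]
print(doc)
```

The trade-off is the usual one: you repeat the department data on every employee document, and pay for it with reindexing when a department changes.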
+1 - the GzipServletFilter is the way to go.
Regarding request handlers reading HTTP headers, yeah,... this will
improve, for sure.
Erik
On Oct 30, 2008, at 12:18 AM, Chris Hostetter wrote:
: You are partially right. Instead of the HTTP header, we use a request
: parameter.
I realize you said caching won't help because the searches are
different, but what about Document caching? Is every document returned
different? What's your hit rate on the Document cache? Can you throw
memory at the problem by increasing Document cache size?
I ask all this, as the Document cache
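For anyone tuning it, the Document cache is sized in solrconfig.xml; a typical stock entry looks something like this (the sizes are illustrative, not a recommendation):

```xml
<documentCache
  class="solr.LRUCache"
  size="512"
  initialSize="512"
  autowarmCount="0"/>
```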
Yeah. I'm just not sure how much benefit in terms of data transfer this will
save. Has anyone tested this to see if this is even worth it?
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
From: Erik Hatcher [EMAIL PROTECTED]
To:
On Thu, Oct 30, 2008 at 2:06 AM, Bill Graham [EMAIL PROTECTED] wrote:
I've been running solr 1.3 on an ec2 instance for a couple of weeks and I've
had some stability issues. It seems like I need to bounce the app once a day.
That I could live with and ultimately maybe troubleshoot, but what's
About a factor of 2 on a small, optimized index. Gzipping took 20 seconds,
so it isn't free.
$ cd index-copy
$ du -sk
134336 .
$ gzip *
$ du -sk
62084 .
wunder
On 10/30/08 8:20 AM, Otis Gospodnetic [EMAIL PROTECTED] wrote:
Yeah. I'm just not sure how much benefit in terms of data transfer
Hi,
I have a data set with the following schema.
PersonName:Text
AnimalName:Text
PlantName:Text
plus a lot more attributes about each of them, like nick name, animal nick
name, plant generic name, etc., which are mutually exclusive
UniqueId:long
For each of the document data set, there will be
Gzipping on disk requires quite a bit of I/O. I guess that on-the-fly zipping
should be faster.
C.
Walter Underwood wrote:
About a factor of 2 on a small, optimized index. Gzipping took 20 seconds,
so it isn't free.
$ cd index-copy
$ du -sk
134336 .
$ gzip *
$ du -sk
62084 .
wunder
On
CPU was at 100%, it was not IO bound. --wunder
On 10/30/08 8:58 AM, christophe [EMAIL PROTECTED] wrote:
Gzipping on disk requires quite a bit of I/O. I guess that on-the-fly zipping
should be faster.
C.
Walter Underwood wrote:
About a factor of 2 on a small, optimized index. Gzipping took 20
One small correction below:
Yonik Seeley wrote:
- I've seen OOM exceptions during warming. I've changed
maxWarmingSearchers=1, which I suspect will do the trick
OOM errors are really tricky - if they happen in the wrong place, they're
hard to recover from gracefully. Correctly cleaning up
Your query
AnimalName:German Shepard
means
AnimalName:German defaultField:Shepard
whatever the defaultField is.
Try
AnimalName:"German Shepard"
or
AnimalName:German AND AnimalName:Shepard
On Thu, Oct 30, 2008 at 12:58 PM, Yerraguntla [EMAIL PROTECTED] wrote:
Hi,
I have a data
Thanks Yonik, I'll try changing the lock type to see how that works.
Looking closer at the logs I see the app was started at Oct 28, 2008 9:49:38,
but not long afterwards it got its first exception when warming the index:
INFO: [] webapp=/solr path=/update params={} status=0 QTime=3
Oct 28,
: I'm doing some experiments with the morelikethis functionality using the
: standard request handler to see if it also works with distributed search (I
: saw that it will not yet work with the MoreLikeThis handler,
: https://issues.apache.org/jira/browse/SOLR-788). As far as I can see, this
:
: Yeah. I'm just not sure how much benefit in terms of data transfer this
: will save. Has anyone tested this to see if this is even worth it?
one man's trash is another man's treasure ... if you're replicating
snapshots very frequently within a single datacenter, speed is critical
and
: a hundred thousand values)... so, I have to display the range (minimum and maximum
: values) from a facet. Is there any way to do that?
: I found the new statistics components, follow the link :
: http://wiki.apache.org/solr/StatsComponent
: But it's for Solr 1.4.
there haven't been many changes on
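For the record, a StatsComponent request is just two extra parameters; a sketch follows (Solr 1.4+, and "price" is a placeholder numeric field name, not from the original thread):

```python
from urllib.parse import urlencode

# Request only field statistics (min, max, sum, count, ...) instead of
# the full facet value list, and no document rows.
params = {
    "q": "*:*",
    "rows": "0",          # stats only, no documents
    "stats": "true",
    "stats.field": "price",
}
url = "http://localhost:8983/solr/select?" + urlencode(params)
print(url)
```

The response then carries a stats block per field, which includes the min and max the original poster was after, without enumerating every facet value.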
Hmm,
I don't have any defaultField defined in the schema.xml.
Can you give the exact syntax, as it looks in schema.xml?
I have <defaultSearchField>text</defaultSearchField>.
Does it mean that if the requested count of matches is not available, it looks
for the search string in any of the text fields that
Never mind,
I understand now.
I have <defaultSearchField>text</defaultSearchField>.
I was searching on a string field with a space in it, with no quotes.
This causes it to scan the text fields (since the default search field is
text) in the schema.
Also in my schema there is an indexed
By defaultField I didn't mean the way to define the default field in the
schema, only a generic way to say "default field name".
The default field name seems to be text in your case.
If the search query doesn't say which field to search, the word will be
searched in that
The http://wiki.apache.org/lucene-java/ImproveIndexingSpeed page suggests that
indexing will be sped up by using higher values of mergeFactor, while search
speed improves with lower values. I need to create an index using multiple
batches of documents. My question is, can I begin building
: myfacet, ASC, limit 1
: myfacet, DESC, limit 1
: So I can get the first value and the last one.
:
: Do you think I will get more performance with this way than using stats?
I'm guessing that by all measurable metrics, the StatsComponent will blow
that out of the water -- I was just putting
The only 'limit' is the effect on your query times... you could have
1000+ facets if you are OK with the response time.
Sorry to give the "it depends" answer, but it totally depends on your
data and your needs.
On Oct 30, 2008, at 7:28 AM, Jeryl Cook wrote:
is there a limit on the
I've actually seen cases on our site where it's possible to bring up
over 30,000 facets for one query. And they actually come up quickly -
like, 3 seconds. It takes longer for the browser to render them.
--
Steve
On Oct 30, 2008, at 6:04 PM, Ryan McKinley wrote:
the only 'limit' is
I understand what you mean... I am building a system that will
dynamically generate facets, which could possibly be thousands, but
at most about 6 or 7 facets will be returned using a facet ranking
algorithm, so I get what you mean if I request in my query that I
want 1000 facets back compared to
Wow, 30k in under 3 seconds
On 10/30/08, Stephen Weiss [EMAIL PROTECTED] wrote:
I've actually seen cases on our site where it's possible to bring up
over 30,000 facets for one query. And they actually come up quickly -
like, 3 seconds. It takes longer for the browser to render them.
--
On Thu, 30 Oct 2008 15:50:58 -0300
Jorge Solari [EMAIL PROTECTED] wrote:
<copyField source="*" dest="text"/>
in the schema file.
or use Dismax query handler.
b
_
{Beto|Norberto|Numard} Meijome
Windows: Where do you want to go today?
Linux: Where do you want to go
I have a DataImportHandler configured to index from an RSS feed. It is a
"latest stuff" feed. It reads the feed and indexes the 100 documents
harvested from the feed. So far, works great.
Now: a few hours later there are a different 100 latest documents. How do
I add those to the index so I will
Thank you so much.
Here goes my Use case:
I need to search the database for a collection of input parameters, which
touches 'n' tables. The data is very large. The search query itself is very
dynamic. I use a lot of views for the same search. How do I make use of Solr
in this case?
On Thu, 30 Oct 2008 20:46:16 -0700
Lance Norskog [EMAIL PROTECTED] wrote:
Now: a few hours later there are a different 100 latest documents. How do
I add those to the index so I will have 200 documents? 'full-import' throws
away the first 100. 'delta-import' is not implemented. What is the
Yes, you can change the mergeFactor. More important than the mergeFactor is
this:
<ramBufferSizeMB>32</ramBufferSizeMB>
Pump it up as much as your hardware/JVM allows. And use appropriate -Xmx, of
course.
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original
man gzip:
-# --fast --best
    Regulate the speed of compression using the specified digit #, where
    -1 or --fast indicates the fastest compression method (less
    compression) and -9 or --best indicates the slowest compression
    method (best compression). The
It could also be that the C version is a lot more efficient than
the Java version and it could take longer regardless. I could not
find a benchmark on that, but C is usually better for bit twiddling.
wunder
On 10/30/08 10:36 PM, Otis Gospodnetic [EMAIL PROTECTED] wrote:
man gzip:
-#