Hoss,
for now I've managed to adjust this in the client code before it touches the
server, so it is no longer urgent.
I wanted to avoid touching the client code (which is giving me, oh great fun,
MSIE concurrency miseries), hence I wanted a server-side rewrite of the maximum
number of hits.
Thanks, Ludovic and Jonathan. Yes, this configuration default is exactly
what I was looking for.
Ran
On Mon, Apr 11, 2011 at 7:12 PM, Jonathan Rochkind rochk...@jhu.edu wrote:
I have not worked with shards/distributed, but I think you can probably
specify them as defaults in your
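For what it's worth, a sketch of how shard defaults (and a hard server-side cap on hits) could be expressed in a solrconfig.xml request handler; the handler name, hosts, and rows value below are placeholders, not taken from this thread:

```xml
<requestHandler name="/search" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- placeholder shard hosts; clients can still override defaults -->
    <str name="shards">shard1.example.com:8983/solr,shard2.example.com:8983/solr</str>
  </lst>
  <lst name="invariants">
    <!-- invariants cannot be overridden by the client: a server-side cap on hits -->
    <str name="rows">100</str>
  </lst>
</requestHandler>
```

Parameters under defaults apply only when the client omits them; invariants always win, which is what a server-side rewrite of the maximum number of hits amounts to.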
Hello.
My NRT search is not correctly configured =(
Two Solr instances: one searcher and one updater.
The updater starts an update of around 3000 documents every minute, and the
searcher starts a commit every minute to refresh the index and read the new
docs.
These are my cache values for a 36
Hi Lance,
Well, I didn't actually copy over the whole configuration files; instead I just
added the missing configuration (into a fresh copy of the example directory).
By the directory implementation, do you mean the readers used by
SolrIndexSearcher?
These are:
reader :
Hello !
Every night within my maintenance window, during high load caused by
PostgreSQL (vacuum analyze), I see a few (10-30) messages showing up in the
Solr 3.1 logfile.
SEVERE: org.mortbay.jetty.EofException
at org.mortbay.jetty.HttpGenerator.flush(HttpGenerator.java:791)
at
I start a commit on the searcher core with:
.../core/update?commit=true&waitFlush=false
-
--- System
One Server, 12 GB RAM, 2 Solr Instances, 7 Cores,
1 Core with 31 Million Documents other Cores 100.000
- Solr1 for
Hey folks,
The Berlin Buzzwords team recently released the schedule for
the conference on high scalability. The conference focuses on the
topics search,
data analysis, and NoSQL. It will take place on June 6th and 7th, 2011, in Berlin.
We are looking forward to two awesome keynote speakers who shaped
My filterCache has a warmupTime of ~6000, but my config is like this:
LRU Cache(maxSize=3000, initialSize=50, autowarmCount=50 ...)
Should I set maxSize to 50 or a similar value?
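For reference, a sketch of the matching solrconfig.xml entry (values illustrative). Since each autowarmed entry re-executes its filter against the new searcher, it is autowarmCount, not maxSize, that drives warmup time:

```xml
<filterCache class="solr.LRUCache"
             size="3000"
             initialSize="50"
             autowarmCount="10"/>
```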
Oooh, my queryResultCache has a warmupTime of 54000 = ~1 minute.
Any suggestions?
I'm fighting with the same problem, but with Jetty.
Is it necessary in this case to also delete the Jetty work dir?
Hi Lance
thanks for your reply, but I have a question:
is this patch committed to trunk?
Hi all,
I am porting a series of Solr plugins previously developed for version 1.4.1
to 3.1.0. I've written some integration tests extending the
AbstractSolrTestCase [1] utility class, but now it seems that it wasn't included
in the solr-core 3.1.0 artifact, as it's in the solr/src/test directory. Was
Hi everyone,
My situation is this: I need to add the value of a field to the score of the
docs returned by the query, but not to all the docs. Example:
q=car returns 3 docs
1-
name=car ford
marketValue=1
score=1.3
2-
name=car citroen
marketValue=2
score=1.3
3-
name=car mercedes
Hi,
I'm trying to do something like this in Solr 1.4.1:
fq=category_id:(24 79)
However, the values inside the parentheses will be fetched through another
query. So far I've tried using _query_, but it doesn't work the way I want it
to. Here is what I'm trying:
fq=category_id:(_query_:"{!lucene
Hi,
from time to time we're seeing a "ProtocolException: Unbuffered entity
enclosing request can not be repeated." in the logs when sending ~500
docs to Solr (the stack trace is at the end of the email).
I'm aware that this was discussed before (e.g. [1]) and our solution was
already to reduce the
Hello.
When I start an optimize (which takes more than 4 hours), no updates from
DIH are possible.
I thought Solr copies the whole index and then starts an optimize on the
copy, rather than locking the index and optimizing it in place ... =(
Any way to do both at the same time?
-
On Tue, Apr 12, 2011 at 6:44 AM, Tommaso Teofili
tommaso.teof...@gmail.com wrote:
Hi all,
I am porting a series of Solr plugins previously developed for version 1.4.1
to 3.1.0. I've written some integration tests extending the
AbstractSolrTestCase [1] utility class, but now it seems that it wasn't
Chris:
Here's the nabble URL:
http://lucene.472066.n3.nabble.com/Strip-spaces-and-new-line-characters-from-data-tp2795453p2795453.html
The message on the Solr list is from alexei on 8 April: "Strip spaces and
newline characters from data."
This started happening a couple (?) of weeks ago and I
FWIW, I see the XML I just sent in Gmail, so I'm guessing the problem is on
the Nabble side, but I have very little evidence.
Erick
P.S. It's not a huge deal, getting to the correct message on nabble is just
a click away. But it is a bit annoying.
On Tue, Apr 12, 2011 at 8:38 AM, Erick
Make sure streaming is on.
-- how do I check that?
-
--- System
One Server, 12 GB RAM, 2 Solr Instances, 7 Cores,
1 Core with 31 Million Documents other Cores 100.000
- Solr1 for Search-Requests - commit every Minute - 5GB
Hi,
I did not want to hijack this thread (
http://www.mail-archive.com/solr-user@lucene.apache.org/msg34181.html)
but I am experiencing the same exact problem mentioned here.
To sum up the issue, I am getting intermittent Unavailable Service
exceptions during the indexing commit phase.
I know that I
I've asked on Nabble if they know of a fix for the problem:
http://nabble-support.1.n2.nabble.com/solr-dev-mailing-list-tp6023495p6264955.html
Steve
-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com]
Sent: Tuesday, April 12, 2011 8:43 AM
To: Chris Hostetter
If your commit from the client fails, you don't really know the
state of your index anyway. All the threads you have sending
documents to Solr are adding them to a single internal buffer.
Committing flushes that buffer.
So if thread 1 gets an error on commit, it will presumably
have some
Sorry, fat fingers. Sent that last e-mail inadvertently.
Anyway, if I have this correct, I'd recommend going to
autocommit and NOT committing from the clients. That's
usually the recommended procedure.
This is especially true if you have a master/slave setup,
because each commit from each client
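The autocommit setup Erick describes can be sketched in solrconfig.xml like this (thresholds are illustrative):

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <!-- commit after this many docs or this many milliseconds, whichever comes first -->
    <maxDocs>10000</maxDocs>
    <maxTime>60000</maxTime>
  </autoCommit>
</updateHandler>
```

With this in place, clients only add documents and never issue commits themselves.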
Hi,
I have been trying to perform a search using a CommonsHttpSolrServer when my
postCommit event listener is called.
I am not able to find the documents just committed; the "post" in postCommit
led me to assume that I would. It seems that the commit only takes effect
when all postCommit have
Try using AND (or set q.op):
q=car+AND+_val_:marketValue
On Apr 12, 2011, at 07:11 , Marco Martinez wrote:
Hi everyone,
My situation is this: I need to add the value of a field to the score of the
docs returned by the query, but not to all the docs. Example:
q=car returns 3 docs
Hi
I would like to build a component that during indexing analyses all tokens
in a stream and adds metadata to a new field based on my analysis. I have
different tasks that I would like to perform, like basic classification and
certain more advanced phrase detections. How would I do this? A
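As a rough sketch of the kind of per-token logic such a component could run (the class name, categories, and keyword lists below are invented for illustration; in Solr this would live inside a custom UpdateRequestProcessor or TokenFilter rather than a standalone class):

```java
import java.util.Arrays;
import java.util.List;

public class TokenClassifier {

    // Very basic keyword-driven classification over a token stream:
    // scan the tokens and derive a value for a new metadata field.
    static String classify(List<String> tokens) {
        for (String t : tokens) {
            String lower = t.toLowerCase();
            if (lower.equals("solr") || lower.equals("lucene")) {
                return "search";      // invented category label
            }
            if (lower.equals("postgresql") || lower.equals("mysql")) {
                return "database";    // invented category label
            }
        }
        return "other";
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("running", "Solr", "on", "jetty");
        System.out.println(classify(tokens)); // prints "search"
    }
}
```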
Thanks Robert, that was very useful :)
Tommaso
2011/4/12 Robert Muir rcm...@gmail.com
On Tue, Apr 12, 2011 at 6:44 AM, Tommaso Teofili
tommaso.teof...@gmail.com wrote:
Hi all,
I am porting a series of Solr plugins previously developed for version
1.4.1
to 3.1.0. I've written some
Thanks, but I tried this and saw that it works in a standard scenario. In my
query, though, I use my own query parser, and it seems that it doesn't apply
the AND and returns all the docs in the index:
My query:
_query_:{!bm25}car AND _val_:marketValue - 67000 docs returned
Solr query parser
car
:
: Here's the nabble URL:
:
:
http://lucene.472066.n3.nabble.com/Strip-spaces-and-new-line-characters-from-data-tp2795453p2795453.html
:
: The message in the Solr list is from alexei on 8-April. Strip spaces and
: newline characters from data.
And the raw message as received by Apache...
On 4/12/2011 6:21 AM, stockii wrote:
Hello.
When I start an optimize (which takes more than 4 hours), no updates from
DIH are possible.
I thought Solr copies the whole index and then starts an optimize on the
copy, rather than locking the index and optimizing it in place ... =(
Any way to do both in the same
You can index and optimize at the same time. The current limitation
or pause happens when the RAM buffer is flushing to disk; however, that's
changing with the DocumentsWriterPerThread implementation, e.g.,
LUCENE-2324.
On Tue, Apr 12, 2011 at 8:34 AM, Shawn Heisey s...@elyograg.org wrote:
On 4/12/2011
I'm not sure it's a 100% solution but the new path hierarchy tokenizer
seems promising. I've only played with it a little bit, with a little too much
booze and not enough sleep (in the sky), so apologies for the
potty-mouth-ness of this blog post.
http://www.aaronland.info/weblog/2011/04/02/status/#sky
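For reference, a sketch of a schema.xml field type using it (the type name is a placeholder):

```xml
<fieldType name="text_path" class="solr.TextField">
  <analyzer>
    <!-- /a/b/c is emitted as the tokens /a, /a/b, /a/b/c -->
    <tokenizer class="solr.PathHierarchyTokenizerFactory" delimiter="/"/>
  </analyzer>
</fieldType>
```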
I have 1 master and 2 slaves set up with 1.3 collection distribution. My
frontend web application queries the master. Do I need to change any code in
the web application to query the slaves, or does the master request queries
from the slaves automatically? Please help, thanks.
Erick,
My setup is not quite the way you described. I have multiple threads
indexing simultaneously, but I only have 1 thread doing the commit after all
indexing threads finished. I have multiple instances of this running each
in their own java vm. I'm ok with throwing out all the docs indexed
Yes. You need to put, say, a load balancer in front of your slaves
and distribute the requests to the slaves.
Best
Erick
On Tue, Apr 12, 2011 at 2:20 PM, Li Tan litan1...@gmail.com wrote:
I have 1 master and 2 slaves set up with 1.3 collection distribution. My
frontend web application does
See below:
On Tue, Apr 12, 2011 at 2:21 PM, Phong Dais phong.gd...@gmail.com wrote:
Erick,
My setup is not quite the way you described. I have multiple threads
indexing simultaneously, but I only have 1 thread doing the commit after
all
indexing threads finished. I have multiple
Hi,
I have been trying to get spellcheck to work in the Chinese language. So far
I have not had any luck. Can someone shed some light here, as a general
guideline on what needs to happen?
I am using the CJKAnalyzer in the text field type and searching works fine,
but spelling does not
Did this go to the list? I think I may need to resubscribe...
Sent from my iPhone
On Apr 12, 2011, at 12:55 AM, Estrada Groups estrada.adam.gro...@gmail.com
wrote:
Has anyone tried doing this? Got any tips for someone getting started?
Thanks,
Adam
Sent from my iPhone
Thanks Eric, I thought the master does that automatically when you set up
collection distribution. I wish there were more documentation for 1.3
collection distribution.
Do you know how to show the slave stats on the Master admin page, the
distribution tab? Thanks in advance guys.
Sent from my iPhone
On
It did: http://search-lucene.com/?q=panaramio
Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/
- Original Message
From: Estrada Groups estrada.adam.gro...@gmail.com
To: Estrada Groups
Hi,
Does spellchecking in Chinese actually make sense? I once asked a native
Chinese speaker about that and the person told me it didn't really make sense.
Anyhow, with n-grams, I don't think this could technically work even if it made
sense for Chinese, could it?
Otis
Sematext ::
If I follow things correctly, I think you should be seeing new documents only
after the commit is done and the new index searcher is open and available for
search. If you are searching before the new searcher is available, you are
probably still hitting the old searcher.
Otis
Sematext ::
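One way to rule out the stale-searcher case is to commit with waitSearcher=true, which blocks the commit request until the new searcher is registered, and only search after it returns (URL and port are the stock example values):

```
curl 'http://localhost:8983/solr/update?commit=true&waitSearcher=true'
```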
Hi,
I did Flickr indexing into Lucene about 3 years ago. There is a Flickr API,
which covers almost everything you need (as I remember, not every
Flickr feature was implemented in the API at that time; for example, the
collection was not searchable). You can harvest by user ID or by
searching for a topic. You can
On Tue, Apr 12, 2011 at 10:25 AM, Marco Martinez
mmarti...@paradigmatecnologico.com wrote:
Thanks, but I tried this and saw that it works in a standard scenario. In my
query, though, I use my own query parser, and it seems that it doesn't apply
the AND and returns all the docs in the index:
My
It doesn't make sense to spell check individual character-sized words,
but it makes a lot of sense for phrases. Due to the pervasive use of pinyin
IMs, it's very easy to write phrases that are totally wrong in
semantics but sound correct. N-grams should work if they don't
mangle the characters.
On
Hi Hoss,
thanks for your response...
you are right, I had a typo in my question, but I did use maxSegments, and
here is the exact URL I used:
curl
'http://localhost:8080/solr/97/update?optimize=true&maxSegments=10&waitFlush=true'
I used jconsole and du -sk to monitor each partial optimize, and
Thanks Otis and Luke.
Yes it does make sense to spellcheck phrases in Chinese. Looks like the
default Solr spellCheck component is already doing some kind of NGram-ing.
When examining the spellCheck index, I did see gram1, gram2, gram3, gram4...
The problem is no Chinese terms were indexed into
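A sketch of the solrconfig.xml wiring this would need; the field name is an assumption, and the key point is that the spellcheck source field must use an analyzer that actually emits CJK tokens:

```xml
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <lst name="spellchecker">
    <str name="name">default</str>
    <!-- assumed field; must be analyzed with a CJK-aware analyzer -->
    <str name="field">text_cjk</str>
    <str name="spellcheckIndexDir">./spellchecker</str>
    <str name="buildOnCommit">true</str>
  </lst>
</searchComponent>
```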
: /tmp # ls /xxx/solr/data/32455077/index | wc --- this is the
start point, 150 seg files
: 150 150 946
: /tmp # time curl
The number of files in the index directory is not the number of
segments.
The number of segments is an internal Lucene concept that impacts
Thanks Peter! I am thinking that I may just use Nutch to do the crawl and index
off of these sites. I need to check out the APIs for each to make sure I'm not
missing anything related to the geospatial data for each image. Obviously both
do the extraction when the images are uploaded so I'm
I am hoping to get some feedback on the architecture I've been planning
for a medium to high volume site. This is my first time working
with Solr, so I want to be sure what I'm planning isn't totally weird,
unsupported, etc.
We've got a pair of F5 load balancers and 4 hosts. 2 of those hosts
I think the repeaters are misleading you a bit here. The purpose of a
repeater is
usually to replicate across a slow network, say in a remote data
center, then slaves at that center can get more timely updates. I don't
think
they add anything to your disaster recovery scenario.
So I'll ignore
ManifoldCF sounds like it might be the right solution, so long as it's
not secretly building a filter query in the back end, otherwise it
will hit the same limits.
In the meantime, I have made a minor improvement to my filter query;
it now scans the permitted IDs and attempts to build a filter
Hi Parker,
Lovely ASCII art. :)
Yes, I think you can simplify this by introducing shared storage (e.g., SAN)
that hosts the index to which your active/primary master writes. When your
primary master dies, you start your stand-by master that is configured to point
to the same index. If there