Below is a sample slow query that takes minutes!
((stock or share*) w/10 (sale or sell* or sold or bought or buy* or
purchase* or repurchase*)) w/10 (executive or director)
If a filter is used it comes in via fq, but what can be done about a plain keyword
search?
On Sun, Mar 16, 2014 at 4:37
Hi, I am testing SolrCloud's compositeId routing but failing to get
documents pertaining to a route. PFB the steps for the same. Please point out
where I am making a mistake in the configuration, or let me know if I have to
do something more.
I'm using ZooKeeper 3.4.5 and two Tomcat 7 servers.
Thanks from both of us.
There was a problem with the server URL like Greg had said.
;)
On 14/03/14 16:20, Furkan KAMACI wrote:
Hi;
There is another issue. It seems like you are using SolrCloud. If so check
here: https://wiki.apache.org/solr/Solrj#Using_with_SolrCloud
Thanks;
Furkan KAMACI
I tried to use *CachedSqlEntityProcessor* in DataImportHandler with a
sub-entity query. It does not seem to be working.
Here is my query:
<entity name="listing" dataSource="mysql" query="SELECT id, make, model FROM
LISTING">
  <entity name="account" dataSource="mssql" query="SELECT name, email FROM
CUSTOMER WHERE
I am running Solr 4.6.1. I'm trying to get Velocity running, but it throws a
class-not-found error. solrconfig.xml already has the
VelocityResponseWriter added.
The Exception stack trace (snipped)
{msg=lazy loading
error,trace=org.apache.solr.common.SolrException: lazy loading error
at
Hi
instanceDir is collection1 in your case. Putting jars under /solr/collection1/lib
should work. What jar files did you put there? Did you include
solr-velocity-4.6.1.jar
commons-beanutils-1.7.0.jar
commons-collections-3.2.1.jar
velocity-1.7.jar
velocity-tools-2.0.jar
On Monday, March 17,
I am using Solr 4.5.1 to suggest movies for my system. What I need is for Solr
to return not only the movie_title but also the movie_id that belongs to the movie.
As an example, this is the kind of thing I need:
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">1</int>
We currently have around 200GB in a server.
I'm aware of the RAM issue, but it somehow doesn't seem related.
I would expect search latency problems, not strange EOFExceptions.
Regarding the http.timeout: I didn't change anything concerning this.
Do I need to explicitly set something different
Thanks Ahmet. I had everything except solr-velocity-4.6.1.jar (which I can
understand now). I thought it was now part of solr-core itself.
Copied it from dist/ to collection1/lib. It's working.
Thanks
On 17 March 2014 13:03, Ahmet Arslan iori...@yahoo.com wrote:
On 17 March 2014 18:13, manju16832003 manju16832...@gmail.com wrote:
Hi,
I am trying to find out if solr supports doing a spatial search on multiple
location points. Basically, while querying solr, I will be giving multiple
lat-long points and solr will be returning documents which are closer to
any of the given points.
If this is not possible, is there any way
Hi Omer,
That's not how it's meant to work; the suggester is giving you
potentially matching terms by looking at the set of terms for the given
field across the index.
Possibly you want to look at the MoreLikeThis component or handler? It
will return matching documents, from which you have
Perhaps index the concatenation of the
two fields, something like this:
hard rain (1998)!14
Then have the app layer peel off the !14 for
displaying the title to the user. Then use the
14 however you need to.
Best,
Erick
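The app-layer "peel off the !14" step Erick describes could be sketched like this (the '!' delimiter and the class/method names are purely illustrative; nothing here is a Solr API):

```java
// Splits a stored value like "hard rain (1998)!14" into title and id.
// Uses lastIndexOf so a '!' inside the title itself won't confuse it,
// as long as the id is always appended last.
public class TitleIdCodec {

    static String[] peel(String stored) {
        int bang = stored.lastIndexOf('!');
        if (bang < 0) {
            return new String[] { stored, null }; // no id was appended
        }
        return new String[] { stored.substring(0, bang),
                              stored.substring(bang + 1) };
    }

    public static void main(String[] args) {
        String[] parts = peel("hard rain (1998)!14");
        System.out.println("title=" + parts[0] + ", id=" + parts[1]);
    }
}
```

The display layer shows parts[0] to the user; parts[1] is the id to use however you need.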
On Mon, Mar 17, 2014 at 6:28 AM, Lajos la...@protulae.com wrote:
Hi Omer,
Absolutely. The most straight-forward approach is to use the default
query parser comprised of OR clauses of geofilt query parser based
clauses. Another way to do it in Solr 4.7 that is probably faster is to
use WKT with the custom "buffer" extension:
myLocationRptField:BUFFER(MULTIPOINT(x y, x
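For reference, the OR-of-geofilt form Lajos mentions can be written with the `_query_` hook so multiple geofilt clauses combine in one filter (the field name, points, and distances below are made up for illustration):

```
fq=_query_:"{!geofilt sfield=location pt=45.15,-93.85 d=5}"
   OR _query_:"{!geofilt sfield=location pt=44.90,-93.30 d=5}"
```

Documents within distance d of any of the listed points will match.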
Hi all!
I have a SolrCloud (v4.6.0) cluster of 5 shards. Four of them I need to
move to another server (configs and indexes). So as I understand, I need to
clear the ZooKeeper data, and after a restart it will be updated. My question
is: if, for example, in the future I need to update a specific document, is
Could someone please explain to me the difference between addField and
setField in SolrInputDocument?
--
View this message in context:
http://lucene.472066.n3.nabble.com/Difference-between-addfield-and-setfield-in-SolrInputDocument-tp4124809.html
Sent from the Solr - User mailing list archive at
On 3/17/2014 7:07 AM, adfel70 wrote:
On Mon, Mar 17, 2014 at 10:22 AM, vit bulgako...@yahoo.com wrote:
addField will add another value to any existing values for the field.
setField will just overwrite anything that is already
The algorithm is only sensitive to the shard ID, you should be able to
freely move the data to another node.
BTW, perhaps the easiest way to do this would be to set up a replica
for the shards you care about on the new hardware (assuming
connectivity) and let Solr do the synchronization for you.
Hi;
addField works like this:
public void addField(String name, Object value, float boost)
{
  SolrInputField field = _fields.get(name);
  if (field == null || field.value == null) {
    setField(name, value, boost);
  }
  else {
    field.addValue(value, boost);
  }
}
Can anyone suggest best practices for how to do SpellCheck and
AutoSuggest in Solarium?
Can anyone give me an example of that?
--
Regards,
*Sohan Kalsariya*
Erick Erickson wrote
Hi Sohan,
The best approach for autosuggest is using a facet query.
Please refer to this link:
http://solr.pl/en/2010/10/18/solr-and-autocomplete-part-1/
Thanks,
SureshKumar.S
From: Sohan Kalsariya sohankalsar...@gmail.com
Sent: Monday, March 17,
I think it's best to use one of the many autosuggesters Lucene/Solr provide.
E.g. AnalyzingInfixSuggester is running here:
http://jirasearch.mikemccandless.com
But that's just one suggester... there are many more.
Mike McCandless
http://blog.mikemccandless.com
On Mon, Mar 17, 2014 at 10:44
Hi Greg,
I added the processor below (RemoveBlankFieldUpdateProcessorFactory) and I am
still getting the same problem.
My XML looks like <field name="Price"></field>
E.g. Pri
-Original Message-
From: Greg Walters [mailto:greg.walt...@answers.com]
Sent: Friday, March 14, 2014 9:32 AM
To:
Martin,
You’re right, a bug was introduced by SOLR-5354. I’ve opened an issue
https://issues.apache.org/jira/browse/SOLR-5875 and will commit the fix
shortly. I hope to include this fix in a 4.7.1 release.
Steve
On Mar 8, 2014, at 1:32 AM, Martin de Vries mar...@downnotifier.com wrote:
If I am only indexing point shapes and I want to change maxDistErr from
0.09 (1m resolution) to 0.00045, will this break (as in, searches stop working),
or will search work but any performance gain won't be seen until all docs
are reindexed? Or will I have to reindex right away?
thanks,
steve
Hi,
This config works for me :
<updateRequestProcessorChain name="remove">
  <processor class="solr.TrimFieldUpdateProcessorFactory" />
  <processor class="solr.RemoveBlankFieldUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
We have a big Solr search application where I need to add faceted search
for a certain request handler.
It does not work, whereas for the select handler it does.
I tried to find something in the configuration but could not.
If possible, please let me know where I should look to find the
On 3/17/2014 9:25 AM, vit wrote:
Martin, I’ve committed the SOLR-5875 fix, including to the lucene_solr_4_7
branch.
Any chance you could test the fix?
Thanks,
Steve
On Mar 17, 2014, at 11:16 AM, Steve Rowe sar...@gmail.com wrote:
Does this bug happen in the non-sharded case? --wunder
On Mar 17, 2014, at 9:15 AM, Steve Rowe sar...@gmail.com wrote:
No, QueryComponent.mergeIds() is only called for distributed queries.
Steve
On Mar 17, 2014, at 12:18 PM, Walter Underwood wun...@wunderwood.org wrote:
Thanks. That is probably worth a mention in the bug and release notes. --wunder
On Mar 17, 2014, at 9:33 AM, Steve Rowe sar...@gmail.com wrote:
Not sure if you have already seen this one..
http://www.solarium-project.org/2012/01/suggester-query-support/
You can also use an edge n-gram filter to implement typeahead auto-suggest.
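A sketch of the edge n-gram approach, as a schema.xml field type (the type name and gram sizes below are arbitrary choices; grams are applied at index time only so query-side input matches prefixes):

```xml
<fieldType name="text_autocomplete" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="25"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Indexing "solr" this way produces the terms s, so, sol, solr, so a query of "so" against a field of this type matches it directly.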
For multivalued fields on atomic update, add will append a value to the
existing list of values, while set will discard the existing values and
start a fresh list of values. So, you could do a set followed by a sequence
of adds to build a new list of values for a multivalued field.
-- Jack
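Jack's set-then-add sequence could look like this as JSON atomic updates posted to /update (the doc id and field name are made up; the field must be stored and multivalued):

```json
[
  {"id": "doc1", "tags": {"set": ["red", "blue"]}},
  {"id": "doc1", "tags": {"add": "green"}}
]
```

After both updates, tags holds red, blue, green: the set discarded whatever was there before, and the add appended to the fresh list.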
That's a good point -- we may not really need to bother with all that.
I guess I tend to do it partly as a way to become aware of new features.
Well, sometimes there are required additions to the schema. For
example, the _version_ field was added at some time, and you really do
need it. I
For example, given a new big department merged from three departments. A
few employees worked for two or three departments before merging. That
means, the attributes of one person might be listed under different
departments' databases. One additional problem is that one person can have
different
Otis, I want to get those spikes down lower if possible. As mentioned in
the posts above, the 25ms timing you are seeing is not really accurate,
because that's the average response time for ALL requests, including the
bulk add operations, which are generally super fast. Our true response time
is
Hi,
We have 80 million records in the index now and we are indexing 800k records
every day. We have one shard and 4 replicas on 4 servers under SolrCloud.
Currently we have a 16GB heap, but during indexing it sometimes reaches
16GB and sometimes it's normal. What is the reason to use the max heap at
Is your JVM running out of RAM (actual exceptions), or is the used heap just
reaching 16G prior to a garbage collection? If it's the latter, then that is
expected behavior and is how Java's garbage collection works.
Thanks,
Greg
On Mar 17, 2014, at 1:26 PM, solr2020 psgoms...@gmail.com wrote:
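One way to confirm it's the normal GC sawtooth Greg describes is to turn on GC logging; these flags are standard for the Java 6/7-era JVMs Solr 4.x typically runs on (the heap size and log path are examples):

```
java -Xms16g -Xmx16g -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps \
     -Xloggc:/var/log/solr/gc.log -jar start.jar
```

A heap that climbs toward -Xmx and drops sharply at each collection is healthy; only an actual OutOfMemoryError indicates a real problem.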
See:
https://cwiki.apache.org/confluence/display/solr/De-Duplication
-- Jack Krupansky
-Original Message-
From: Mobius ReX
Sent: Monday, March 17, 2014 1:59 PM
To: solr-user@lucene.apache.org
Subject: any project for record linkage, fuzzy grouping, and deduplication
based on
It's entirely possible that you're seeing higher memory usage while indexing
due to more objects being created and abandoned. Another thing to consider
could be your commit settings. Perhaps
http://www.oracle.com/webfolder/technetwork/tutorials/obe/java/gc01/index.html
can answer some of your
Previously we faced an OOM when we tried to index 1.2M records at the same time.
Now we have divided that into two chunks and index twice. So now we are not
getting an OOM, but heap usage is higher. So we are analyzing and trying to find
the cause to make sure we don't get an OOM again.
On 3/17/2014 12:39 PM, solr2020 wrote:
Yes Shawn, our data source is an Oracle DB. Here is the dataSource section of
the config:
<dataSource name="jdbc" driver="oracle.jdbc.OracleDriver"
            url="jdbc:oracle:thin:@dbname:port:dbname" user="user"
            password="password"
            batchSize="5000" autoCommit="false" />
Hello,
We recently upgraded to SolrCloud 4.7 (went from a single-node Solr 4.0
instance to a 3-node Solr 4.7 cluster).
Part of our application does an automated traversal of all documents that
match a specific query. It does this by iterating through results by
setting the start and rows
I should add that each node has 16GB of RAM, 8GB of which is allocated to the
JVM. Each node has about 200k docs and happily uses only about 3 or 4GB of
RAM during normal operation. It's only during this deep pagination that we
have seen OOM errors.
On Mon, Mar 17, 2014 at 3:14 PM, Mike Hugo
Hi Mike,
The OOM you’re seeing is likely a result of the bug described in (and fixed by
a commit under) SOLR-5875: https://issues.apache.org/jira/browse/SOLR-5875.
If you can build from source, it would be great if you could confirm the fix
addresses the issue you’re facing.
This fix will be
Thanks Steve,
That certainly looks like it could be the culprit. Any word on a release
date for 4.7.1? Days? Weeks? Months?
Mike
On Mon, Mar 17, 2014 at 3:31 PM, Steve Rowe sar...@gmail.com wrote:
Mike,
Days. I plan on making a 4.7.1 release candidate a week from today, and
assuming nobody finds any problems with the RC, it will be released roughly
four days thereafter (three days for voting + one day for release propagation
to the Apache mirrors): i.e., next Friday-ish.
Steve
On Mar
Thanks!
On Mon, Mar 17, 2014 at 3:47 PM, Steve Rowe sar...@gmail.com wrote:
Hi,
A couple of times I found myself in the following situation: I had to work
on a Solr schema but had no docs to index yet (the DB was not ready, etc.).
In order to start learning JS, I needed some small project to practice on,
so I thought of this small utility. It allows you to generate fake
Hi,
The Suggest search component that comes preconfigured in the Solr 4.7.0
solrconfig.xml seems to fail when I call it:
http://localhost:8983/solr/suggest?spellcheck=on&q=ac&wt=json&indent=true
msg: "No suggester named default was configured"
Can someone tell me what's going on there?
Hi Joel, thanks for taking a look into this. Here's the information you had
requested. *ADSKDedup:* I've attached separate files with debug information
for each query. Let me know if you need any more information.
Regards, Shamik
CollapsingQParserPlugin_Query_Debug.txt
Hi Steve,
I've posted previously about a nice stack overflow exception I got when
using this component ... can you post what you see?
I've used it successfully with a custom dictionary like this:
<searchComponent name="newsuggester" class="solr.SuggestComponent">
  <lst name="suggester">
This is highly anecdotal, but I tried SOLR-1880 with 4.7 for some tests I
was running, and saw almost a 30% improvement in latency. If you're only
doing document selection, it's definitely worth having.
I'm reasonably certain that the patch would work in 4.6 too, but the test
file relies on some
Shouldn't all deep pagination against a cluster use the new cursor mark
feature instead of 'start' and 'rows'?
4 or 5 requests still seems a very low limit to be running into OOM
issues though, so perhaps it is both issues combined?
Ta,
Greg
On 18 March 2014 07:49, Mike Hugo
Cursor mark definitely seems like the way to go. If I can get it to work
in parallel, then that's an additional bonus.
On Mon, Mar 17, 2014 at 5:41 PM, Greg Pendlebury
greg.pendleb...@gmail.com wrote:
My suspicion is that it won't work in parallel, but we've only just asked
the ops team to start our upgrade to look into it, so I don't have a server
yet to test. The bug identified in SOLR-5875 has put them off though :(
If things pan out as I think they will I suspect we are going to end up
On Mon, Mar 17, 2014 at 7:14 PM, Greg Pendlebury
greg.pendleb...@gmail.com wrote:
Deep paging with cursorMark does work with distributed search
(assuming that's what you meant by parallel... querying sub-shards
in parallel?).
-Yonik
Sorry, I meant one thread requesting records 1 - 1000, whilst the next
thread requests 1001 - 2000 from the same ordered result set. We've
observed several of our customers trying to harvest our data with
multi-threaded scripts that work like this. I thought it would not work
using cursor marks...
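A toy sketch (no Solr involved) of why the multi-threaded harvesting Greg describes can't use cursor marks: each request's mark comes out of the previous response, so two client threads cannot fetch page N and page N+1 independently the way they can with start/rows. Here fetchPage stands in for one HTTP request, and an integer index stands in for the opaque cursorMark string:

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class CursorSketch {
    static final List<Integer> DOCS = new ArrayList<>();
    static {
        for (int i = 0; i < 25; i++) DOCS.add(i); // a sorted result set
    }

    // Stand-in for one request: up to 'rows' docs after 'mark', plus the next mark.
    static Map.Entry<List<Integer>, Integer> fetchPage(int mark, int rows) {
        int end = Math.min(mark + rows, DOCS.size());
        return new AbstractMap.SimpleEntry<>(DOCS.subList(mark, end), end);
    }

    // The only way to walk the result set: feed each response's mark
    // into the next request. No request can be issued before the one
    // before it has returned.
    static int countPages(int rows) {
        int mark = 0, pages = 0;
        while (true) {
            Map.Entry<List<Integer>, Integer> page = fetchPage(mark, rows);
            if (page.getKey().isEmpty()) break; // empty page: done
            pages++;
            mark = page.getValue();
        }
        return pages;
    }

    public static void main(String[] args) {
        System.out.println(countPages(10)); // 25 docs in pages of 10 -> 3 pages
    }
}
```

With start/rows, by contrast, each worker computes its own offset up front, which is exactly the chunked harvesting pattern described above.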
Hi,
I'm using SolrCloud 4.4 with 2 shards having 2 replicas each.
Lately, I'm observing issues where an obsolete document will suddenly show
up in search results. I'm crawling a bunch of source systems on a daily basis
and updating the Solr index. Now, when I'm searching for a specific
Looks interesting. I like the admin-page-extra integration and
point-at-local-solr aspects. I looked at both approaches before as
well, but nothing in the public code.
Regards,
Alex.
Personal website: http://www.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
-
Greg and I are talking about the same type of parallel.
We do the same thing - if I know there are 10,000 results, we can chunk
that up across multiple worker threads up front without having to page
through the results. We know there are 10 chunks of 1,000, so we can have
one thread process
That's great Jeff! Thanks for sharing your experience. SOLR-5768 will
make it even better.
https://issues.apache.org/jira/browse/SOLR-5768
On Tue, Mar 18, 2014 at 3:35 AM, Jeff Wartes jwar...@whitepages.com wrote:
This is highly anecdotal, but I tried SOLR-1880 with 4.7 for some tests I
was
Does anyone know how to index a lot of files with ExtractingRequestHandler
using a single request?