Absolutely, that's what I didn't get in your initial question. Okay, it
seems you are talking about a typical eCommerce search problem. I will speak
about it at http://www.apachecon.eu/schedule/presentation/18/ See you.
On Fri, Oct 5, 2012 at 9:47 AM, rhl4tr rhl4...@gmail.com wrote:
But user query
what's the value of rows param
http://wiki.apache.org/solr/CommonQueryParameters#rows ?
On Fri, Oct 5, 2012 at 6:56 AM, Aaron Daubman daub...@gmail.com wrote:
Greetings,
I've been seeing this call chain come up fairly frequently when
debugging longer-QTime queries under Solr 3.6.1 but have
Hi all,
I want to have a field for each document which will simply store the doc's
position (rank, not its score) for each query. So for each different query
it will show the doc's new rank within the whole search result...
I have been digging through the source code (4.0 Beta) but so far couldn't
Hi,
I'm generating a Solr index from a DB with the dataConfig section below in my
data-config.xml file, and it's working fine.
<dataConfig>
  <dataSource type="JdbcDataSource"
              driver="com.microsoft.sqlserver.jdbc.SQLServerDriver"
              url="jdbc:sqlserver://127.0.0.1;databaseName=emp" user="user"
              password="user*"/>
On Fri, Oct 5, 2012 at 4:33 AM, Mikhail Khludnev
mkhlud...@griddynamics.com wrote:
what's the value of rows param
http://wiki.apache.org/solr/CommonQueryParameters#rows ?
Very interesting question - so, for historic reasons lost to me, we
pass in a huge (1000?) number for rows and this hits
I have only pencil scratches yet, can't share it. I can say that i've found
it quite close to approach described there
http://www.ulakha.com/publications.html it's called there Concept Search,
but as far as I understand I have rather different implementation approach.
On Fri, Oct 5, 2012 at 2:31
Okay. A huge rows value is the no. 1 way to kill Lucene. It's absolutely not
workable. You need to rethink the logic of your component. Check Solr's
FieldCollapsing code; IIRC it makes a second search to achieve a similar goal.
Also check the PostFilter and DelegatingCollector classes; their approach can
also be
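For what it's worth, the usual alternative to one huge rows request is paging with the rows and start parameters, so each request stays bounded (values here are illustrative, not from the thread):

```
/select?q=*:*&rows=100&start=0
/select?q=*:*&rows=100&start=100
/select?q=*:*&rows=100&start=200
```

Each request fetches the next page of 100; the client loops until fewer than rows documents come back.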
I'm a little confused about what you actually expect to see. I mean, it
sounds like all you are doing is numbering N query results as positions
1..N. But that's too obvious to be useful. Maybe you could provide an
example.
Or are you talking about query refinement, where you do one query and
I _think_ I have this right...
ReplicationFactor is the maximum number of extra replicas per shard.
If you don't
specify this, then as you bring up more and more nodes, the new nodes get
assigned on a round-robin basis to shards. This allows you to have heterogeneous
collections and not have
Can anyone point to a document that describes the meanings behind the
different solrcloud graph shard colors?
I have several that are orange now, with two as the active shard, and
our total index count is less than it was a day before. The logs
aren't indicating anything in particular.
Thanks,
Hi,
Thank you for the reply Davide.
By writing to the db, do you mean inserting the search queries into the db? I
was thinking that this might affect search performance.
Yes, you are right, getting stats for a particular keyword is tough. It would
suffice if I can get the q param and fq param values (when we
If you think this could be a problem for your performance you can try two
different solutions:
1 - Make the call to update the db in a different thread
2 - Make an asynchronous HTTP call to a web application that updates the db
(in this case the web app can be resident on a different machine, so
Hi Vadim,
I attached a zip (solr plugin) file to SOLR-1604. This is not a patch. It is
supposed to work with Solr 4.0. Some tests fail, but it should work with pol*
tel*~5 types of queries.
Ahmet
--- On Thu, 9/27/12, Vadim Kisselmann v.kisselm...@gmail.com wrote:
From: Vadim Kisselmann
Hey Kris
Right now there is no specific Document .. but we could perhaps add a kind of
legend on this screen? .. in the meanwhile, does this help?
http://svn.apache.org/viewvc/lucene/dev/trunk/solr/webapp/web/css/styles/cloud.css?view=markup#l259
The used css-classname is what we get from
Hi,
We've been using V4.x of SOLR since last November without too much
trouble. Our MySQL database is refreshed daily and a full import is run
automatically after the refresh and generally produces around 86,000
products, obviously on unique doc_id's.
So, we upgraded to 4.0 Beta a few days
Hi Mikhail,
I read the article and can't see how to solve my problem with FieldCollapsing.
Any other suggestions?
Torben
Am 04.10.2012 um 17:31 schrieb Mikhail Khludnev:
it's a typical nested document problem. There are several approaches. An out
of the box solution, as far as you need facets, is
A legend would be awesome, I'm vastly in favor of not having to go to
external docs.
Tooltip would work too.
whichever is easier...
Best
Erick
On Fri, Oct 5, 2012 at 10:24 AM, Stefan Matheis
matheis.ste...@gmail.com wrote:
Hey Kris
Right now there is no specific Document .. but we could
How are you indexing? There was a problem with indexing from SolrJ
if you indexed documents in batches, server.add(doclist); that's fixed in
4.0 RC#. The work-around is to add docs singly, server.add(doc).
Second thing: Bad Things Happen if you don't have a _version_ field
in your schema.xml. Solr
denormalize your docs to (option, value) tuples, and identify them by a
duplicated setid:
<doc>
  <str name="setid">3</str>
  <str name="options">A</str>
  <str name="value">200</str>
</doc>
<doc>
  <str name="setid">3</str>
  <str name="options">B</str>
  <str name="value">400</str>
</doc>
<doc>
  <str name="setid">3</str>
  <str name="options">B</str>
  <str
Thanks Erick.
We've added the '_version_' and we'll see if that makes a difference
tomorrow. Also, have downloaded the RC1 and will try that next week.
Regards,
David Q
-----Original Message-----
From: Erick Erickson [mailto:erickerick...@gmail.com]
Sent: 05 October 2012 15:40
To:
Hi,
I am using EmbeddedSolrServer to access indexed data.
I have to query the server around 250K times, with a different query each time.
I have already created the queries, but every query to Solr takes time.
I am querying using threads and a loop, but it's still not so fast.
Is there any way to speed
It's working fine on the server.
The problem was on my local PC, and might have been caused by some
misconfiguration.
Thank you very much.
On Fri, Oct 5, 2012 at 11:23 AM, Sushil jain jain.ayushm...@gmail.comwrote:
I am using Solr 1.4.1 and the same Solr is indexing the documents. I have
I think that's correct, but only when creating a new collection. I don't
know if the replication factor is considered after that (running more nodes
that have a core with the collection name, or manually adding nodes to the
collection), or if some nodes go down.
Also, please someone correct me if
The very first question is what form are your XML docs in?
Solr does NOT index arbitrary XML, so I'm guessing
you're using DIH and some of the xml stuff there. Do note
that the XSLT is a subset of the full capabilities
Second, I'd recommend you just put it all in a single index, it'll be
Here's a reference, much of it is at the Lucene layer, but
it might be helpful.
http://wiki.apache.org/lucene-java/ImproveSearchingSpeed
If I'm reading this right, you want to get through 250K queries.
What kind of throughput are you seeing? What is your target
speed?
I suspect you're going to
hi Shawn,
thanks for the detailed explanation.
I have got one doubt: you said it doesn't matter how many segments an index
has, but then why does Solr have this merge policy which merges segments
frequently? Why can't it leave the segments as they are rather than merging
smaller ones into bigger ones?
Right now there is no specific Document .. but we could perhaps add a kind of
legend on this screen? .. in the meanwhile, does this help?
http://svn.apache.org/viewvc/lucene/dev/trunk/solr/webapp/web/css/styles/cloud.css?view=markup#l259
The used css-classname is what we get from
Hi Toke,
Were you able to find anything on this issue? We are running at 30 TPS and
using the default HttpSolrServer for the posts.
Thanks,
Balaji
Balaji Gandhi, Senior Software Developer, Horizontal Platform Services
Product Engineering │ Apollo Group,
because eventually you'd run out of file handles. Imagine a
long-running server with 100,000 segments. Totally
unmanageable.
I think Shawn was emphasizing that RAM requirements don't
depend on the number of segments. There are other
resources that files consume, however.
Best
Erick
On Fri, Oct 5,
Erick,
I did mention using the DIH to index the first two datasets; that is
where the root of my problem lies.
I do see the benefit of one index. However, the question still
remains: can I use the DIH to index XML from data sets 1 and 2 every
15 minutes or so (full index) without wiping out
Thanks a lot for all the replies. Chris, it worked out with this mm value:
<str name="mm">10%</str>
If this version of Solr is affected by the bug you pointed out, shouldn't it
fail with this value as well?
Greetings!
On Oct 4, 2012, at 8:48 PM, Jorge Luis Betancourt Gonzalez wrote:
Hi Chris:
: So extracting the attachment you will be able to track down what happens
:
: this is the query that shows the error, and below you can see the latest stack
: trace and the qt definition
Awesome -- exactly what we needed.
I've reproduced your problem, and verified that it has something to do
DIH always gives me indigestion.
Couple of things:
See the 'clean' parameter here for full import:
http://wiki.apache.org/solr/DataImportHandler
it defaults to true. I think if you set it to false
_and_ assuming that your uniqueKey is
defined, it should work OK.
The other approach would be
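Putting Erick's suggestion into a concrete request, a full import with clean disabled would look something like this (host and handler path assumed, the defaults from the DIH wiki):

```
http://localhost:8983/solr/dataimport?command=full-import&clean=false
```

With clean=false and a uniqueKey defined, existing docs are updated in place rather than the whole index being wiped first.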
Hi Everyone,
I am using Solr 3.6. I want to update a single field value in the index without
re-indexing. Is this possible?
I have googled and came across partial update in Solr 4.0 BETA.
Can I do this with Solr 3.6?
Thanks,
-- Pramila Thakur
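For reference, the 4.0 BETA partial ("atomic") update mentioned above is expressed as JSON sent to the update handler, roughly like this (the id and price field are made-up illustrations; it relies on a _version_ field and stored fields in the schema):

```json
[{"id": "doc1", "price": {"set": 9.99}}]
```

The "set" modifier replaces just that field's value; the other stored fields are carried over by Solr.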
Using the same unique key doesn't handle documents which disappear from one
indexing to the next.
Instead, add a field for the type of item, like type:animal, type:vegetable, or
type:mineral. Then the query used to clean up before indexing can delete all
items of that type.
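Wunder's cleanup step can be issued as a delete-by-query posted to /update before re-indexing (field name and value taken from his example):

```xml
<delete><query>type:vegetable</query></delete>
```

Each indexing job then only touches documents of its own type, so vanished documents of that type are removed without disturbing the others.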
wunder
On Oct 5,
Hi,
This is not doable in Solr 3.*. There are Lucene-level patches in
JIRA, but I'm not sure if they are in Solr 4.*
Otis
--
Search Analytics - http://sematext.com/search-analytics/index.html
Performance Monitoring - http://sematext.com/spm/index.html
On Fri, Oct 5, 2012 at 3:02 PM, Thakur,
Looks like HttpClient jar is not in your CLASSPATH or in -cp.
Otis
--
Search Analytics - http://sematext.com/search-analytics/index.html
Performance Monitoring - http://sematext.com/spm/index.html
On Fri, Oct 5, 2012 at 3:33 PM, Prithu Banerjee prid...@gmail.com wrote:
I have been using solrJ
Ok ok, thanks a lot Otis. This had been bothering me for a long while. Thanks
a ton.
On Sat, Oct 6, 2012 at 1:05 AM, Otis Gospodnetic otis.gospodne...@gmail.com
wrote:
Looks like HttpClient jar is not in your CLASSPATH or in -cp.
Otis
--
Search Analytics -
Could you please tell me more? What field do you need to update, how does it
influence the search results, how often, and why can you not afford a commit?
On Fri, Oct 5, 2012 at 11:14 PM, Otis Gospodnetic
otis.gospodne...@gmail.com wrote:
Hi,
This is not doable in Solr 3.*. There are Lucene-level
Thank you Erick for your quick response.
Yes, you are right about my problem.
The indexes are on the same machine, and yes, I am using a single machine.
I am using the EmbeddedSolrServer class of SolrJ, which removes the HTTP layer.
But it still takes time.
On Fri, Oct 5, 2012 at 10:19 PM, Erick Erickson
Balaji,
What is 30 TPS ?
Toke,
You should use EmbeddedSolrServer instead.
On Fri, Oct 5, 2012 at 11:42 PM, balaji.gandhi
balaji.gan...@apollogrp.eduwrote:
Hi Toke,
Were you able to find anything on this issue? We are running at 30 TPS and
using the default HttpSolrServer for the posts.
Sushil,
30 TPS = 30 transactions (updates) per second.
Is the recommendation to use EmbeddedSolrServer instead of HttpSolrServer?
Thanks,
Balaji
Balaji Gandhi, Senior Software Developer, Horizontal Platform Services
Product Engineering │ Apollo Group, Inc.
1225 W. Washington St. | AZ23 |
Hi Ahmet,
thank you, it sounds great:)
I will test it in the next days and give feedback.
Best regards
Vadim
2012/10/5 Ahmet Arslan iori...@yahoo.com:
Hi Vadim,
I attached a zip (solr plugin) file to SOLR-1604. This is not a patch. It is
supposed to work with Solr 4.0. Some tests fail but
Hi,
I think you should store this outside of Solr, in a DB or file or
Redis (key is doc ID, value is a query=position map) or ...
Otis
--
Search Analytics - http://sematext.com/search-analytics/index.html
Performance Monitoring - http://sematext.com/spm/index.html
On Fri, Oct 5, 2012 at 5:13
Yes, I'd recommend EmbeddedSolrServer, because it doesn't require any web
server for read/write/update/delete operations.
On Sat, Oct 6, 2012 at 1:48 AM, balaji.gandhi
balaji.gan...@apollogrp.eduwrote:
Sushil,
30 TPS = 30 transactions (updates) per second.
Is the recommendation to use
Already started .. if you want to follow and give feedback :)
https://issues.apache.org/jira/browse/SOLR-3915
On Friday, October 5, 2012 at 7:53 PM, Kristopher Kane wrote:
I also vote for a legend on the monitor.
Hi Eric,
I am in a major dilemma with my index now. I have got 8 cores, each around
300 GB in size; half of the documents in them are deleted, and on top of that
each has got around 100 segments as well. Do I issue an expungeDeletes and
allow the merge policy to take care of the segments, or optimize
Sushil, we are trying to call the VIP in front of the SOLR nodes to distribute
the update load.
Also is EmbeddedSolrServer thread safe?
Balaji Gandhi, Senior Software Developer, Horizontal Platform Services
Product Engineering │ Apollo Group, Inc.
1225 W. Washington St. | AZ23 | Tempe, AZ
If you need to use Solr in an embedded application, this is the recommended
approach. It allows you to work with the same interface whether or not you
have access to HTTP.
And it is not thread safe.
On Sat, Oct 6, 2012 at 1:58 AM, balaji.gandhi
balaji.gan...@apollogrp.eduwrote:
Sushil, we are
But look what you're asking Solr to do. 250K queries. Let's say you get 100 QPS,
which for a single box isn't bad. That's still 2,500 seconds, roughly
40 minutes.
But you still haven't told us what QPS you're seeing. Or what you need to see.
Or what kind of results you need from your queries.
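Erick's back-of-the-envelope arithmetic above can be sanity-checked quickly (the 100 QPS figure is his assumption for a single box, not a measurement):

```python
# Rough runtime for a fixed batch of queries at an assumed throughput.
queries = 250_000   # total queries to run (from the thread)
qps = 100           # assumed queries/second for one machine
seconds = queries / qps
minutes = seconds / 60
print(f"{seconds:.0f} s ~= {minutes:.0f} min")  # 2500 s ~= 42 min
```

Doubling QPS (better hardware, batching, or more machines) halves the wall-clock time, which is why the actual observed QPS matters so much here.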
My first reaction is you have too much stuff on a single machine. Your
cumulative index size is 2.4 TB. Granted, it's a beefy machine, but still...
And index size isn't all that helpful, as it includes the raw stored data,
which doesn't really come into play for sizing things; subtract out the
Well, using embedded Solr isn't necessarily indicated. I have a couple
of questions.
1 - You say 30 TPS. Are you sending a single doc at a time or batching them
up? I.e. server.add(doclist) or server.add(doc)?
2 - HTTP isn't actually an inefficient protocol; I think the whole idea of
using embedded
Does DIH support only deleting/re-indexing docs of a certain type?
I.E. can I have a DIH for type:vegetable and another for type:mineral
and each only deletes/recreates the right types?
Thanks.
On Fri, Oct 5, 2012 at 1:04 PM, Walter Underwood wun...@wunderwood.org wrote:
Using the same unique
Mr. Miller said that it depends
If you create your collection with the collections api, then replicationFactor
will only see the currently live nodes, not nodes started later.
However, collections added to solr.xml on all nodes will participate in auto
role assignment for new nodes started. I
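For context, creating a collection with an explicit replicationFactor through the Collections API looks roughly like this (collection name and counts are illustrative):

```
/admin/collections?action=CREATE&name=mycollection&numShards=2&replicationFactor=2
```

As discussed above, the factor is applied against the nodes live at creation time; nodes started later don't automatically raise it.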
Hi Mikhail,
thank you for your answer. Maybe my sample data was not so good. The documents
always have additional data which I need to use as facets, like this:
<doc>
  <str name="id">3</str>
  <str name="attribute_A">value</str>
  <str name="attribute_B">value</str>
  <arr name="options">
    <str>A</str>
    <str>B</str>
What will happen if in my query I specify a greater number for rows than the
queryResultWindowSize in my solrconfig.xml?
For example, if queryResultWindowSize=100, but I need to process a batch query
from Solr with rows=1000 each time and vary the start as I move on... what will
happen? If I do not turn
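For reference, the setting being asked about lives in the query section of solrconfig.xml:

```xml
<queryResultWindowSize>100</queryResultWindowSize>
```

It only controls how many result ids are cached per query in the queryResultCache; a request with rows larger than the window is still answered, it just can't be served from that cache window.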
If I were you and not knowing all your details...
I would optimize indices that are static (not being modified) and
would optimize down to 1 segment.
I would do it when search traffic is low.
Otis
--
Search Analytics - http://sematext.com/search-analytics/index.html
Performance Monitoring -
57 matches