Hi,
As part of the Solr results I am able to get the max score. If I want to filter
the results based on the max score, let us say the max score is 10, then I need
only the results between the max score and 50% of the max score. This max score
is going to change dynamically. How can we implement this? Do we need to
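One client-side approach is to request the score with each document (fl=*,score) and post-filter the returned results in your application; a sketch only (Python, names illustrative, not a Solr API):

```python
def filter_by_max_score(docs, ratio=0.5):
    """Keep only documents scoring between the max score and ratio * max score.

    docs: a list of dicts carrying a "score" key, e.g. parsed from a Solr
    response requested with fl=*,score. Because the cutoff is computed from
    the actual max score of each result set, it adapts automatically as the
    max score changes from query to query.
    """
    if not docs:
        return []
    cutoff = max(d["score"] for d in docs) * ratio
    return [d for d in docs if d["score"] >= cutoff]
```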
Hi,
I have 25 indexed fields in my document. But by default, if I give
q=laptops, this is going to search on five fields, and I am getting the score
as part of the search results. How will Solr calculate the score? Is it going to
calculate it only on the five fields, or on all 25 fields which are indexed? What is
Hi,
In my document I have a field called category. This contains
electronics, games, etc. For some of the category values I need to boost
the document score. Let us say, for the electronics category, I will set the
boosting parameter greater than for the games category. Does anybody have
an idea how to
Hi
There is a situation where I search for more than one keyword; my main two
fields are ad_title and ad_description.
I want those results which match all of the keywords in both fields to
come out on top. Then, sequentially, keywords can be dropped one by one in
further results.
E.g. in a search of 3
I hacked SnapPuller to log the cost, and the log looks like this:
[2010-11-01 17:21:19][INFO][pool-6-thread-1][SnapPuller.java(1037)]readFully1048576 cost 979
[2010-11-01 17:21:19][INFO][pool-6-thread-1][SnapPuller.java(1037)]readFully1048576 cost 4
[2010-11-01
I suspected my app has some sleeping op every 1s, so
I changed ReplicationHandler.PACKET_SZ to 1024 * 1024 * 10; // 10MB
and the log result looks like this:
[2010-11-01 17:49:29][INFO][pool-6-thread-1][SnapPuller.java(1038)]readFully10485760 cost 3184
[2010-11-01
Hm, I do not have a webserver set up, for security reasons. I use SVNKit to
connect to SVN via the file:// protocol; what I get then is a
ByteArrayOutputStream. What would the buffer solution or the dual-thread
Writer/Reader pair look like?
-Original Message-
From: Lance Norskog
Ok, so if I did NOT use SolrJ, could I PUSH a stream to Solr somehow?
I do not depend on SolrJ; any connection method would suffice.
On 11/01/2010 03:23 AM, Lance Norskog wrote:
2.
The SolrJ library handling of content streams is pull, not push.
That is, you give it a reader and it pulls
Ok, I imagined that the doubly linked list would be far too complicated for
Solr.
Now, how can I make Solr connect to a webservice and do the import?
I'm sorry if I'm not clear; sometimes my English gets fuzzy :P
On Fri, Oct 29, 2010 at 4:51 PM, Yonik Seeley
Hi,
Yes, sometimes it takes 5 minutes for a query. I agree this is not desirable.
However, if the application has no control over the input queries other than
closing the socket after a while, Solr should not continue writing the
response, but should terminate the thread.
In general, is there a way
Hello,
With solr example, using facet.field=text creates UnInvertedField
for the text field in fieldValueCache. After that, I looked at the stats page
and was surprised that the counters in *filterCache* were up:
lookups : 213
hits : 106
hitratio : 0.49
inserts : 107
evictions : 0
size : 107
warmupTime : 0
2010/11/1 Koji Sekiguchi k...@r.email.ne.jp:
With solr example, using facet.field=text creates UnInvertedField
for the text field in fieldValueCache. After that, I saw stats page
and I was surprised at counters in *filterCache* were up:
Are they caused by the big terms in the UnInvertedField?
Yes.
Yonik,
Thank you for your reply. I just wanted to share my surprise. :)
Koji
--
http://www.rondhuit.com/en/
(10/11/01 23:17), Yonik Seeley wrote:
2010/11/1 Koji Sekiguchik...@r.email.ne.jp:
With solr example, using facet.field=text creates UnInvertedField
for the text field in
Here's a good place to start:
http://search.lucidimagination.com/search/out?u=http://lucene.apache.org/java/2_4_0/scoring.html
But what do you mean by this is going to search on five fields? This
Would simple boosting work? As in category:electronics^2?
If not, perhaps you can explain a bit more about what you're trying to
accomplish...
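For example, with the dismax handler you could add boost queries (a sketch; boost values are illustrative):

```
q=laptops&defType=dismax&bq=category:electronics^3 category:games^1.5
```

With the standard query parser, a term boost such as category:electronics^2 inside the query itself works the same way.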
Best
Erick
On Sun, Oct 31, 2010 at 10:55 PM, sivaprasad sivaprasa...@echidnainc.comwrote:
Hi,
In my document I have a field called category. This
I'm not sure this exactly fits your use-case, but it may come
close enough. Have you looked at disMax and the mm parameter
(minimum should match)?
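For example (a sketch; field names come from your message, the mm value is illustrative):

```
q=word1 word2 word3&defType=dismax&qf=ad_title ad_description&mm=2
```

With mm=2, at least two of the three keywords must match; documents matching all three still score higher and tend to come out on top.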
Best
Erick
On Mon, Nov 1, 2010 at 5:00 AM, Pawan Darira pawan.dar...@gmail.com wrote:
Hi
There is a situation where i search for more than 1
I'm going to nudge you in the direction of understanding why the queries
take so long in the first place rather than going toward the blunt approach
of cutting them off after some time. The fact that you don't control the
queries submitted doesn't prevent you from trying to understand what
is
We are trying to solve some multilingual issues with our Solr analysis filter
chain and would like to use the new Lucene 3.x filters that are Unicode
compliant.
Is it possible to use the Lucene ICUTokenizerFilter or StandardAnalyzer with
UAX#29 support from Solr?
Is it just a matter of
On Mon, Nov 1, 2010 at 12:24 PM, Burton-West, Tom tburt...@umich.edu wrote:
We are trying to solve some multilingual issues with our Solr analysis filter
chain and would like to use the new Lucene 3.x filters that are Unicode
compliant.
Is it possible to use the Lucene ICUTokenizerFilter or
I think you guys are talking about two different kinds of 'virtual
hosts'. Lance is talking about CPU virtualization. Eric appears to be
talking about apache virtual web hosts, although Eric hasn't told us how
apache is involved in his setup in the first place, so it's unclear.
Assuming you
I'm trying to exclude certain facet results from a facet query. It
seems to work but rather than being excluded from the facet list its
returned with a count of zero.
Ex:
q=(-foo:bar)&facet=true&facet.field=foo&facet.sort=idx&wt=json&indent=true
This returns bar with a count of zero. All the
Hey guys,
I have a solr index where i store information about experts from
various fields. The thing is, when I search for channel marketing I
get people that have the word channel or marketing in their data. I
only want people who have that entire phrase in their bio. I copy the
contents of bio
On Mon, Nov 1, 2010 at 12:55 PM, Tod listac...@gmail.com wrote:
I'm trying to exclude certain facet results from a facet query. It seems to
work but rather than being excluded from the facet list its returned with a
count of zero.
If you don't want to see 0 counts, use facet.mincount=1
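For example, the query from above with the parameter added (a sketch):

```
q=(-foo:bar)&facet=true&facet.field=foo&facet.sort=idx&facet.mincount=1&wt=json&indent=true
```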
I was speaking about Apache virtual hosts. I was concerned that there was an
increase in processing time due to the Solr and Nutch instances being housed
inside a virtual host, as opposed to being dropped in the root of my distro.
Thank you for the astute clarification.
-Original Message-
From:
Take a look at term proximity and phrase query.
http://wiki.apache.org/solr/SolrRelevancyCookbook
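For example (a sketch; the field name bio comes from your message):

```
q=bio:"channel marketing"
q=bio:"channel marketing"~3
```

The first form requires the exact phrase; the second (a proximity query with slop) allows the terms to appear within three positions of each other.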
Hey guys,
I have a solr index where i store information about experts from
various fields. The thing is when I search for channel marketing i
get people that have the word channel or marketing
Thanks Robert,
I'll use the workaround for now (using StandardTokenizerFactory and specifying
version 3.1), but I suspect that I don't want the added URL/IP address
recognition due to my use case. I've also talked to a couple people who
recommended using the ICUTokenFilter with some rule
Mark,
I have the same question so I did a little research on this. Not a complete
answer but here is what I've found:
- the threads parameter was added in SOLR-1352
(https://issues.apache.org/jira/browse/SOLR-1352).
- Also see
Guys, the string type did the trick :)
Thanks
--
View this message in context:
http://lucene.472066.n3.nabble.com/indexing-tp1816969p1823199.html
Sent from the Solr - User mailing list archive at Nabble.com.
On Mon, Nov 1, 2010 at 1:34 PM, Burton-West, Tom tburt...@umich.edu wrote:
Thanks Robert,
I'll use the workaround for now (using StandardTokenizerFactory and
specifying version 3.1), but I suspect that I don't want the added URL/IP
address recognition due to my use case. I've also talked
Hi,
I'm pretty much of a Solr newbie currently packaging solrpy for Debian;
see
http://svn.debian.org/viewsvn/python-modules/packages/python-solrpy/trunk/
In order to run solrpy's supplied tests at build time, I'd need Solr to
know about the schema.xml that comes with the tests.
Can anyone tell
On 11/1/2010 1:03 PM, Yonik Seeley wrote:
On Mon, Nov 1, 2010 at 12:55 PM, Todlistac...@gmail.com wrote:
I'm trying to exclude certain facet results from a facet query. It seems to
work but rather than being excluded from the facet list its returned with a
count of zero.
If you don't want
Hi,
I think I have seen a comment on the list from someone with the same need a few
months ago.
He planned to make a new fieldType to support this, e.g. MinMaxRangeFieldType,
which would be a polyField type holding both a min and a max value; you could
then query it with
q=myminmaxfield:123
I did
I don't think you read the entire thread. I'm assuming you made a mistake.
-Original Message-
From: Chris Hostetter [mailto:hossman_luc...@fucit.org]
Sent: Monday, November 01, 2010 11:49 AM
To: solr-user@lucene.apache.org
Subject: Re: Solr in virtual host as opposed to /lib
:
No, he didn't make a mistake, but you did. Next time, please start a new thread,
not by conveniently replying to an existing thread and just changing the
subject. Now we have two threads in one thread. :)
I don't think you read the entire thread. I'm assuming you made a mistake.
-Original
: I don't think you read the entire thread. I'm assuming you made a mistake.
No mistake. When you sent your first message with the subject Solr in
virtual host as opposed to /lib you did so in response to a completely
unrelated thread (Searching with wrong keyboard layout or using
translit)
My index is 13M documents big, and I have not yet indexed all of my documents;
the index in the production system should be about 30M documents big.
So with my test 13M index I try a search over all documents, with the
first query: q:[2008-10-27 12:23:00:00 TO 2009-04-29 23:59:00:00]
then I run the next query, for
I took a swag at applying SOLR-1873 to branch_3x. It mostly applied; most of
the remaining issues were ZooKeeper integrations, and those applied cleanly
by hand.
There were also a few constants and such that needed to be pulled in from trunk.
At the moment, it passes all the tests. I have
It is useful for parsing PDFs on a multi-processor machine. Also, if a
sub-entity does an outbound I/O call to a database, a file, or another
SOLR (SOLR-1499).
Anything where the pipeline time outweighs disk i/o time.
Threading happens on a per-document level- there is no concurrent
access
Besides, I don't know how you'd stop Solr processing a query mid-way
through; I don't know of any way to make that happen.
The timeAllowed parameter causes a timeout in the Solr server to kill
the searching thread. They use that now.
But, yes, Erick is right: there is a fundamental problem
Yes, you can write your own app to read the file with SVNkit and post
it to the ExtractingRequestHandler. This would be easiest.
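For example, once your app has read the file out of SVN, posting it could look like this (a sketch; the URL, id, and filename are illustrative):

```
curl "http://localhost:8983/solr/update/extract?literal.id=doc1&commit=true" \
  -F "myfile=@document.pdf"
```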
On Mon, Nov 1, 2010 at 5:49 AM, getagrip getag...@web.de wrote:
Ok, so if I did NOT use Solr_J I could PUSH a Stream to Solr somehow?
I do not depend on Solr_J, any
If you just want a quick way to query Solr server, Perl module
Webservice::Solr is pretty good.
On Mon, Nov 1, 2010 at 4:56 PM, Lance Norskog goks...@gmail.com wrote:
Yes, you can write your own app to read the file with SVNkit and post
it to the ExtractingRequestHandler. This would be
Careful here. First searches are known to be slow, various caches
are filled up the first time they are used etc. So even though you're
measuring the second query, it's still perhaps filling caches.
And what are you measuring? The raw search time or the entire response
time? These can be quite
My documents have a down_vote field. Every time a user votes down a document,
I increment the down_vote field in my database and also re-index the document
to Solr to reflect the new down_vote value.
During searches, I want to restrict the results to only documents with, say
fewer than 3
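Assuming the cutoff is 3, a range query on that field as a filter might look like (a sketch):

```
fq=down_vote:[0 TO 2]
```

As an fq it does not affect scoring and can be cached by the filter cache.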
From the user perspective I wouldn't delete it, because the down-vote could
be a mistake or spam or something, and up-voting can resurrect it.
It could also be wise to keep the docs, to see which content (from which
users?) gets down-voted, to catch spam accounts.
From the dev perspective
Just deleting a document is faster because all that really happens
is the document is marked as deleted. An update is really
a delete followed by an add of the same document, so by definition
an update will be slower...
But... does it really make a difference? How often do you expect this to
The actual time it takes to delete or update the document is unlikely to
make a difference to you.
What might make a difference to you is the time it takes to actually
finalize the commit, and the time it takes to re-warm your indexes after
a commit, and especially the time it takes to run
It's not looking very promising, but is there something I'm missing to be able
to apply a field boost from within a transformer in the DataImportHandler? Not
a boost defined within the schema, but a boost applied to the field from the
transformer itself.
I know you can do a document boost, but
We've been trying to get a setup in which a slave replicates from a
master every few seconds (ideally every second but currently we have it
set at every 5s).
Everything seems to work fine until, periodically, the slave just stops
responding from what looks like it running out of memory:
This is the time to replicate and open the new index, right? Opening a
new index can take a lot of time. How many autowarmers and queries are
there in the caches? Opening a new index re-runs all of the queries in
all of the caches.
2010/11/1 kafka0102 kafka0...@163.com:
I suspected my app has
You should query against the indexer. I'm impressed that you got 5s
replication to work reliably.
On Mon, Nov 1, 2010 at 4:27 PM, Simon Wistow si...@thegestalt.org wrote:
We've been trying to get a setup in which a slave replicates from a
master every few seconds (ideally every second but
I have a number of fields I need to do an exact match on. I've defined
them as 'string' in my schema.xml. I've noticed that I get back query
results that don't have all of the words I'm using to search with.
For example:
On Mon, Nov 1, 2010 at 10:26 PM, Tod listac...@gmail.com wrote:
I have a number of fields I need to do an exact match on. I've defined
them as 'string' in my schema.xml. I've noticed that I get back query
results that don't have all of the words I'm using to search with.
For example:
How about a timestamp with a GUID appended on the end of it?
Dennis Gearon
Signature Warning
It is always a good idea to learn from your own mistakes. It is usually a
better idea to learn from others’ mistakes, so you do not have to make them
yourself. from
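A minimal sketch of the timestamp-plus-GUID idea (Python; the function name is illustrative):

```python
import time
import uuid

def make_unique_id():
    # Millisecond timestamp first, so ids sort roughly by creation time;
    # a random GUID after it guarantees uniqueness across machines.
    return "%d-%s" % (int(time.time() * 1000), uuid.uuid4())
```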
Scenario:
Git update to current trunk (Nov 1, 2010).
Build all.
Run Solr in trunk/solr/example with 'java -jar start.jar'.
Hit ^C.
Jetty reports running its shutdown hook.
There is now a data/index with a write lock file in it. I have not
attempted to read the index, let alone add something to it.
I start