I'm surprised I haven't seen a post on this, but maybe the answers are
obvious. I'm using a cursor to page the results. If I want to enable reverse
paging (go back one page) I have to either:
a) Keep a map of all cursor marks the user made paging forward. This map
could get very long if a user
Hello,
I did not know which mailing list would be right (java-user vs. solr-user), so I am mailing both.
My group uses Solr/Lucene, and we have custom collectors.
I stumbled upon the implementation of SolrIndexSearcher.java and saw this:
Hi - MoreLikeThis is not based on cosine similarity. The idea is that rare
terms - high IDF - are extracted from the source document, and then used to
build a regular Query. That query follows the same rules as regular queries,
the rules of your similarity implementation, which is TFIDF by
Hi Folks,
Can one of you share a shell script (or a script in another language) to
spin up a new Solr node deployed in Tomcat, with most of the configs from
ZooKeeper, some from SVN, and some default values?
#some default directory
${solrDataDir}=/opt
#some host name
Dear Erik,
Thank you for your response. Would you please tell me why this score can
be higher than 1, while cosine similarity cannot be higher than 1?
On Feb 2, 2015 7:32 PM, Erik Hatcher erik.hatc...@gmail.com wrote:
The scoring is the same as Lucene. To get deeper insight into how a score
You don't have to use SolrJ. It's just a web request to a url, so just
issue the request in Java and parse the JSON response.
http://stackoverflow.com/questions/7467568/parsing-json-from-url
SolrJ does make it simpler, however.
Jim
On 2/2/15, 12:57 PM, mathewvino vinojmat...@hotmail.com wrote:
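Jim's plain-HTTP suggestion can be sketched like this; the host, collection, and response body below are placeholders, not output from a real server:

```python
import json
from urllib.parse import urlencode

def build_select_url(base, **params):
    """Build a Solr /select URL; 'base' is a placeholder host/collection."""
    params.setdefault("wt", "json")   # ask Solr for a JSON response
    return base + "/select?" + urlencode(params)

url = build_select_url("http://server:8983/solr/collection1", q="*:*", rows=10)

# Fetching would be urllib.request.urlopen(url).read(); here we parse
# a canned response body of the same shape instead:
body = '{"response": {"numFound": 2, "docs": [{"id": "1"}, {"id": "2"}]}}'
docs = json.loads(body)["response"]["docs"]
```

SolrJ wraps exactly these steps: URL building, the request, and response parsing.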
There was a similar discussion recently. I think the conclusion was that if
your users have to page through 5 million results, you have a bigger
problem on your hands than storing the page marks.
You could store them on the client side.
Regards,
Alex
On 02/02/2015 4:08 pm, tedsolr
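Alex's client-side idea can be sketched as a stack of cursorMark values: push the mark used for each page going forward, pop to go back. The "mark1"/"mark2" strings are stand-ins for the opaque nextCursorMark values Solr returns, not real ones:

```python
class CursorHistory:
    """Client-side history of Solr cursorMark values for back-paging."""

    def __init__(self):
        self.marks = ["*"]  # Solr's initial cursorMark is "*"

    def current(self):
        # The mark to send with the current page's request
        return self.marks[-1]

    def forward(self, next_mark):
        # next_mark: the nextCursorMark returned with the page just read
        self.marks.append(next_mark)

    def back(self):
        # Re-fetch the previous page by popping back to its mark
        if len(self.marks) > 1:
            self.marks.pop()
        return self.current()

h = CursorHistory()
h.forward("mark1")
h.forward("mark2")
```

The stack grows by one short string per page, so even long forward sessions stay cheap on the client.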
The problem is that if you want only docs 200-250, how do you know whether
any particular doc will wind up in positions 0-199? You process a doc and
find its score is X. That has no relation to the score of the _next_ doc
you score, or the previous one for that matter.
So to find the doc in
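Erick's point, that a page deep in the results can only be identified after scoring every match, can be illustrated with toy scores (nothing here is Lucene's actual scoring):

```python
import heapq

# Toy corpus: 1000 docs with arbitrary, unrelated scores.
scores = {f"doc{i}": (i * 37) % 101 for i in range(1000)}

# To return docs 200-250 sorted by score, every match must be scored
# first; one doc's score says nothing about where it finally ranks.
top = heapq.nlargest(250, scores.items(), key=lambda kv: kv[1])
page = top[200:250]
```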
Hi All,
Is there a way to delete a value from a Multi-value field without reindexing
anything?
Let's say I have three documents A, B and C with field XYZ set to 1,2,3;
2,3,4; and 1. I'd like to remove anything that has the value '1' in the
field XYZ. That is, I want to remove the value '1' from
Have a look here:
https://wiki.apache.org/solr/SimpleFacetParameters#Multi-Select_Faceting_and_LocalParams
it might answer your question. Typically what I recommend is to keep
the selected facet in view, but without any limitation on its counts.
However if you want to hide it altogether, I
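The tag/exclude pattern from that wiki page looks like the following as request parameters; the field name 'brand' and tag name 'brandTag' are made up for illustration:

```python
from urllib.parse import urlencode, parse_qs

# Tag the filter query, then exclude that tag when faceting on the
# same field, so the selected facet keeps its unrestricted counts.
params = [
    ("q", "*:*"),
    ("fq", "{!tag=brandTag}brand:camera"),
    ("facet", "true"),
    ("facet.field", "{!ex=brandTag}brand"),
]
query_string = urlencode(params)
decoded = parse_qs(query_string)
```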
You do not have WordDelimiterFilterFactory in your index-time
analysis chain. And you're using different tokenizers in the two
cases. This will almost certainly lead to surprising results
unless you completely and thoroughly understand all the nuances
here.
I _strongly_ recommend you do not do
Hi,
I'm thinking about having an instance of Solr (SolrA) with all fields
stored and just id indexed, in addition to a normal production instance of
Solr (SolrB) that is used for the searches.
This would allow me to read only what changed from previous crawl, update SolrA
and send the
Hi,
I have the output below in one part of my debugQuery response. First, I would
like to know why the highlighted part happens. Does it mean that there are
multiple matches on synonyms on the field txtmysite?
Is it possible to somehow change the sum operation to a max? I
already tweaked the idf
Hi all,
I have two SOLR shards of about 20 GB (server1_2015_02_01 and
server2_2015_02_01 are the shard names, representing core 2015_02_01). For
historical reasons we are not using replication, instead we are writing from a
source to each of the shards.
I've found that not all the rows have
Hi Erick,
Thanks for your reply!
I totally understand that each shard should have a different name and each
replica should too. But I want user6 to be the name of the collection.
Similar to how we have collection1 in the quick start. I'm hoping to set
up one collection per user, which may span
Cool.
For your information, there are multiple existing Solr proxies out there, one
of them being Mr. Smiley's, in Java. There are also ones in PHP, Node, etc.
Here is one link; there are others as well:
https://github.com/evolvingweb/ajax-solr/wiki/Solr-proxies
--
Jan Høydahl, search solution architect
Hi,
My Solr logs directory has been growing large. Is this a serious problem,
and does it harm my Solr performance for both indexing and searching?
I have a similar use-case. Check out the export capability and using
cursorMark.
-Joe
On 2/2/2015 8:14 AM, Matteo Grolla wrote:
Hi,
I'm thinking about having an instance of solr (SolrA) with all fields
stored and just id indexed in addition with a normal production instance of
solr
https://issues.apache.org/jira/browse/SOLR-7069
When querying a collection with a core in the down state, if we send the
request to the server containing the down core while the server is
active, it cannot fail over to the good replica of the same shard on another
server.
The steps to make a core go down
1 is not too small a value; in fact, it's the default value. Of course, the
more combinations it has to try, the slower it will run, but the penalty is
small enough that you're not going to notice. The only problem you might have
is if you use a lot of 1-character stop words; you might get these
Sorry, that feature is not available in Solr at this time. You could
implement an update processor which copied only the desired input field
values. This can be done in JavaScript using the script update processor.
-- Jack Krupansky
On Mon, Feb 2, 2015 at 2:53 AM, danny teichthal
Hi,
I forgot to add a link about discountOverlaps:
https://lucene.apache.org/core/4_3_0/core/org/apache/lucene/search/similarities/DefaultSimilarity.html#discountOverlaps
ahmet
On Monday, February 2, 2015 3:42 PM, Ahmet Arslan iori...@yahoo.com.INVALID
wrote:
Hi Bruno,
Recently I observed
Hi,
I want to analyze a document and get its tokens and their frequencies without
committing it to the index.
I found the Document Analysis Handler on Google, which can do that, but I
don't know how to use it.
Can someone help me use it or point me to some tutorials?
I am using Ruby.
regards,
--
Hi Bruno,
Recently I observed the same thing. If a query term is expanded into multiple
terms (at the same position) in the analysis chain, the contributions of the
subqueries are summed. This behaviour boosts expanded terms and may not always
be desired. I think there should be a flag/switch which is
Hi,
I want to tokenize a query like CHQ PAID-INWARD TRAN-HDFC LTD in such a way
that it gives me result documents containing HDFC LTD and not HDFC MF.
How can I do this?
I have already applied the tokenizers below:
<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
I have an 8-node Solr Cloud cluster connected to an external ZooKeeper. Each
node: 30 GB, 4 cores.
I have created around 100 collections, each collection having approx. 30
shards. (Why I need this, let that be a different story; business isolation,
business requirements could be anything.)
Now, I am
I really doubt you want to do this. It's perfectly possible to host
multiple replicas
or multiple shards on the same Solr. So if you name the shards all user6,
how
would they be distinguished?
Best,
Erick
On Mon, Feb 2, 2015 at 12:31 AM, Avanish Raju yar...@gmail.com wrote:
Hi all,
I'm
Using the /suggest handler wired to SuggestComponent, the SpellCheckResponse
objects are not populated.
Reason is that QueryResponse looks for a top-level element named "spellcheck":
else if ( "spellcheck".equals( n ) ) {
  _spellInfo = (NamedList<Object>) res.getVal( i );
Hi - you can use the MLT query parser in Solr 5.0 or patch 4.10.x
https://issues.apache.org/jira/browse/SOLR-6248
-Original message-
From:Tim Hearn timseman...@gmail.com
Sent: Saturday 31st January 2015 0:31
To: solr-user@lucene.apache.org
Subject: Hit Highlighting and More Like
Which 20GB are you asking about?
Indexed data size is 2.04GB.
I checked the index data using sudo du -hs
test_azure_mapping/nitin/solr/node*/solr/wikingram_shard*
On Mon, Feb 2, 2015 at 2:54 PM, Toke Eskildsen t...@statsbiblioteket.dk
wrote:
On Mon, 2015-02-02 at 09:59 +0100, Nitin Solanki wrote:
On Mon, 2015-02-02 at 10:36 +0100, Nitin Solanki wrote:
Which 20GB are you asking about?
Indexed data size is 2.04GB.
Yes, but at
http://stackoverflow.com/questions/28273340/solr-ate-all-memory-and-throws-bash-cannot-create-temp-file-for-here-document
you stated that Solr allocated 22GB RAM.
Before stopping Solr, I checked the memory using free -m:
free = 5000 approx. out of 28000.
After stopping Solr, I checked again using free -m:
it shows free = 2 approx. of 28000.
Sorry for saying 22GB; it is actually 15GB.
Any help, Toke?
On Mon, Feb 2, 2015 at 3:29 PM, Toke Eskildsen
Hello,
On the one side, Lucene has DisjunctionMaxQuery (not the default summing
SHOULD behavior). However, Solr's dismax doesn't work like this: it maxes
a word across fields, but sums across words. It seems like you need to develop
your own QParser, or you can try to mimic the necessary scoring by the
Why have you created ngrams of size 3? Do you want matches also in the case of
spelling mistakes?
If you want 2 consecutive tokens to match, you can create shingles. Please
refer to link
https://cwiki.apache.org/confluence/display/solr/Filter+Descriptions#FilterDescriptions-ShingleFilter
Thanks,
Dikshant
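What the ShingleFilter produces can be sketched in a few lines (simplified: the real filter also emits the original unigrams by default):

```python
def shingles(tokens, size=2, sep=" "):
    """Return all consecutive 'size'-token shingles from a token list."""
    return [sep.join(tokens[i:i + size]) for i in range(len(tokens) - size + 1)]

bigrams = shingles(["please", "divide", "this", "sentence"])
```

With bigram shingles indexed, a phrase like "HDFC LTD" becomes a single token, so it matches only documents containing those two words consecutively.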
I developed a max-score query parser for a customer some time ago, and we
contributed it back.
This should be what you're looking for:
https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-MaxScoreQueryParser
--
Jan Høydahl, search solution architect
Cominvent AS -
Hi All,
I am using the MySQL dataimport handler to index documents from a database. I
have stored news articles in the database. I have made the following changes
in my dataimport config:
<requestHandler name="/dataimport"
    class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str
Hi Michael Della and Michael Sokolov,
*size of tlog :-*
56K  /mnt/nitin/solr/node1/solr/wikingram_shard3_replica1/data/tlog/
56K  /mnt/nitin/solr/node1/solr/wikingram_shard7_replica1/data/tlog/
56K  /mnt/nitin/solr/node2/solr/wikingram_shard4_replica1/data/tlog/
52K
Hi Jean,
Please see the issues
https://issues.apache.org/jira/browse/SOLR-3862
https://issues.apache.org/jira/browse/SOLR-5992
Both of them are resolved. The *remove* clause (atomic update) was
added in the 4.9.0 release. I haven't checked it, though.
Thanks,
Lokesh
On Tue, Feb 3, 2015 at 7:26
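For reference, the atomic-update *remove* clause takes a JSON body shaped like this sketch (doc ids and the XYZ field follow the example upthread; it would be POSTed to /update with Content-Type application/json):

```python
import json

# Remove the value 1 from multivalued field XYZ on docs A and C,
# leaving the rest of each document untouched.
payload = [
    {"id": "A", "XYZ": {"remove": [1]}},
    {"id": "C", "XYZ": {"remove": [1]}},
]
body = json.dumps(payload)
```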
Conceptually, your understanding of VSM cosine similarity is correct.
In text analysis the range is 0 to 1, as there is no negative similarity.
The scores from handlers which internally use Lucene's cosine similarity can
also go beyond 1. The reason is that these scores are computed for each
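The contrast is easy to check numerically: cosine similarity over non-negative term-frequency vectors is bounded by 1, while Lucene's raw scores are not normalized that way. A toy computation:

```python
import math

def cosine(a, b):
    """Cosine similarity of two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Term-frequency vectors are non-negative, so the result is in [0, 1].
sim = cosine([1.0, 2.0, 0.0], [2.0, 1.0, 1.0])
```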
Hi,
I was wondering what the range of scores returned by the More Like This
query in Solr is. I know that Lucene uses cosine similarity in the vector
space model for calculating the similarity between two documents. I also know
that cosine similarity is between -1 and 1, but the fact that I don't
Not on copyField,
You can use UpdateRequestProcessor instead (
http://www.solr-start.com/javadoc/solr-lucene/org/apache/solr/update/processor/CloneFieldUpdateProcessorFactory.html
).
This allows you to specify both inclusion and exclusion patterns.
Regards,
Alex.
Sign up for my Solr
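A solrconfig.xml sketch of that processor; the field patterns and chain name are invented for illustration, but the source/exclude/dest structure follows the linked javadoc:

```xml
<updateRequestProcessorChain name="clone-fields">
  <processor class="solr.CloneFieldUpdateProcessorFactory">
    <lst name="source">
      <str name="fieldRegex">.*_txt</str>
      <lst name="exclude">
        <str name="fieldRegex">internal_.*</str>
      </lst>
    </lst>
    <str name="dest">text_all</str>
  </processor>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```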
Hi Ahmet,
You completely summed up my problem. That flag/control would be great.
Thanx
Bruno
On Mon, Feb 2, 2015 at 1:36 PM, Ahmet Arslan iori...@yahoo.com.invalid
wrote:
Hi Bruno,
Recently I observed the same thing. If a query term is expanded into
multiple terms (at the same position) in
I had a discussion with @search_mb about this on IRC, and he explained how
my collection query would still work with user6, though we couldn't
resolve why the solr Core Admin page doesn't show the collection name as
user6.
Detailed chat log follows:
Combining
https://cwiki.apache.org/confluence/display/solr/Using+Solr+From+Ruby with
https://lucene.apache.org/solr/4_2_0/solr-core/org/apache/solr/handler/FieldAnalysisRequestHandler.html,
here’s a Ruby example. I used the *field* analysis request handler as that
is perhaps more likely what
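Whatever the client language, the request is just parameters sent to the /analysis/field endpoint; a Python sketch of building it (field name and sample text are illustrative):

```python
from urllib.parse import urlencode

# FieldAnalysisRequestHandler: analyze a value against a field's
# analysis chain without indexing anything.
params = urlencode({
    "analysis.fieldname": "text",
    "analysis.fieldvalue": "The quick brown fox",
    "wt": "json",
})
url = "http://localhost:8983/solr/collection1/analysis/field?" + params
```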
Hi Chris,
Thanks so much for your reply and clarification. I was able to get the
collection search working as expected. :)
Also sharing the following JIRA issue for fixes/improvements to Solr Admin
for SolrCloud-based things, shared by @elyagrog on IRC:
This is difficult to diagnose, but here's some questions I would ask
myself:
Can you reliably recreate the error?
Can you recreate the error faster by writing to all 100 collections at
once?
Can you recreate the error faster with fewer nodes?
Is just one Solr node or one Solr collection
Hi,
I'm wondering if anyone can point me to a website that uses Solr's
Suggester or Autocomplete, or whatever you call it. I am looking for
something that is close to the default provided in the examples, but is
also used commercially.
I have a local Solr installation that is
Please go ahead and play with autocomplete on safaribooksonline.com/home
- if you are not a subscriber you will have to sign up for a free
trial. We use the AnalyzingInfixSuggester. From your description, it
sounds as if you are building completions from a field that you also use
for
The scoring is the same as Lucene. To get deeper insight into how a score is
computed, use Solr’s debug=true mode to see the explain details in the response.
Erik
On Feb 2, 2015, at 10:49 AM, Ali Nazemian alinazem...@gmail.com wrote:
Hi,
I was wondering what is the range of score
It looks to me like you simply want to split the incoming query by the
hyphen, so that it searches for exact codes like this: "CHQ PAID" "INWARD
TRAN" "HDFC LTD".
If that's true, I'd either just change the query at the client to do what
you want, or look into something like the PatternTokenizer:
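The client-side rewrite can be sketched directly (query string from the original post):

```python
import re

query = "CHQ PAID-INWARD TRAN-HDFC LTD"

# Split on hyphens and quote each code as an exact phrase query.
phrases = ['"%s"' % part for part in re.split(r"-", query)]
rewritten = " ".join(phrases)
```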
: I had a discussion with @search_mb about this on IRC, and he explained how
: my collection query would still work with user6, though we couldn't
: resolve why the solr Core Admin page doesn't show the collection name as
: user6.
Core Admin pages in the UI are still specific to *CORES* ... no
I was tempted to suggest rehab -- but seriously it wasn't clear if Nitin
meant the log files Michael is referring to, or the transaction log
(tlog). If it's the transaction log, the solution is more frequent hard
commits.
-Mike
On 2/2/2015 11:48 AM, Michael Della Bitta wrote:
If you'd like
: Because they have different potential authors, the two systems now serve
: different purposes.
:
: There are still some pages on the MoinMoin wiki that contain
: documentation that should be in the reference guide, but isn't.
:
: The MoinMoin wiki is still useful, as a place where users can
If you'd like to reduce the number of lines Solr logs, you need to edit the
file example/resources/log4j.properties in Solr's home directory. Change
lines that say INFO to WARN.
Michael Della Bitta
Senior Software Engineer
o: +1 646 532 3062
appinions inc.
“The Science of Influence Marketing”
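The edit Michael describes amounts to something like this in example/resources/log4j.properties (exact logger keys vary by Solr version, so treat this as a sketch):

```properties
# Was: log4j.rootLogger=INFO, file, CONSOLE
log4j.rootLogger=WARN, file, CONSOLE
```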
On 2 February 2015 at 11:26, O. Olson olson_...@yahoo.it wrote:
I also know that I do not have the
capability to do a lot of customizations to Solr that are much beyond the
defaults and changing a few settings.
Actually, you have the capability to do an unbelievable level of
customization in Solr,
Good call, it could easily be the tlog Nitin is talking about.
As for which definition of high, I was making assumptions as well. :)
Michael Della Bitta
Senior Software Engineer
o: +1 646 532 3062
appinions inc.
“The Science of Influence Marketing”
18 East 41st Street
New York, NY 10017
Thank you Michael. I will look at safaribooksonline.com later today when I
create my account.
I am not sure how to use AnalyzingInfixSuggester. I googled a bit, and I can
find the source code, but not how to use it.
You are perfectly correct when you say that I am using a field also used for
Hi
Could you please suggest how to exclude a selected filter from Solr search
results?
For example, in the screenshot below, I have selected the filter camera, but
camera (1) is still returned in the search response. How can I ask Solr to
remove the selected filter from the search results?
Thanks in advance.
Alexandre Rafalovitch wrote
Actually, you have the capability to do an unbelievable level of
customization in Solr, starting from the schema definition and going down to
writing custom components in Java. Or even completely rebuilding Solr
the way you want from sources. Or was that a reference to your
Umang,
I believe this mailing list strips images. You might have better luck
uploading your image to a 3rd party hosting site and providing a link.
Thanks,
Mike
On Mon, Feb 2, 2015 at 12:35 PM, Umang Agrawal umang.i...@gmail.com wrote:
Hi
Could you please suggest how to exclude selected
The Solr Scale Toolkit should be a good option for you when it comes to
deploying/managing Solr nodes in a cluster.
It has a lot of support for stuff like spinning up new nodes, stopping,
patching, rolling restart etc.
About not knowing python, as is mentioned in the README, you don't really
need to
From memory: there are different methods in SolrIndexSearcher for a reason. It
has to do with paging and sorting. Whenever you sort on a simple field, you
can easily start at a specific offset. The problem with sorting on score is
that the score has to be calculated for all documents matching the query.
Wow!!!
thanks Joe!
On 02 Feb 2015, at 15:05, Joseph Obernberger wrote:
I have a similar use-case. Check out the export capability and using
cursorMark.
-Joe
On 2/2/2015 8:14 AM, Matteo Grolla wrote:
Hi,
I'm thinking about having an instance of solr
Hoss et al.,
I'm not intending on contributing documentation in any immediate sense (the
disclaimer), but I thank you all for the clarification.
It makes some sense to require a committer to review each suggested piece
of official documentation, but I wonder abstractly how a non-committer then
Hi There,
I am using solrj API to make call to Solr Server with the data that I am
looking for. Basically I am using
solrj api as below to get the data. Everything is working as expected
HttpSolrServer solr = new
HttpSolrServer("http://server:8983/solr/collection1");
SolrQuery query = new
I had been running *solr* for a *long time, approx 2 weeks*, when I saw that
Solr had eaten around *22 GB* of my *Server's* *28 GB RAM*.
Please check explanation of error on
http://stackoverflow.com/questions/28273340/solr-ate-all-memory-and-throws-bash-cannot-create-temp-file-for-here-document
Hi all,
I'm learning to create collections by http for a new solr instance. To
create a new collection called *user6*, I tried the following:
http://104.154.50.127:8983/solr/admin/collections?action=CREATE&name=user6&numShards=1&replicationFactor=2&property.instanceDir=user6&property.name=
Hi,
My name is Sergio Garcia.
I would be interested in this role. Attached you can find a copy of my CV.
Regards,
Sergio
On 31 January 2015 at 14:18, MKGoose m...@monkeygoose.com wrote:
We are looking for a remote / freelance consultant to work with us on a
project related to Solr faceted
On Mon, 2015-02-02 at 09:11 +0100, Nitin Solanki wrote:
I had been running *solr* for a *long time, approx 2 weeks*, when I saw that
Solr had eaten around *22 GB* of my *Server's* *28 GB RAM*.
I am guessing your index is about 20GB? Solr's MMapDirectory allocates
virtual memory equal to the size of
Ran the command *ulimit -a* and found *open files (-n) 1024*.
Size of Xmx = 2048MB.
Total indexed data = 2.04GB.
I checked my solrconfig.xml but didn't find any value for *MMapDirectory*. I
don't know the default size of MMapDirectory.
How do I increase the open files limit?
On Mon, Feb 2, 2015 at
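On "How do I increase the open files limit?": that limit is the operating system's per-process file-descriptor cap (what ulimit -n reports), not a Solr or MMapDirectory setting. A Python sketch of inspecting it:

```python
import resource

# The same numbers `ulimit -n` shows: (soft limit, hard limit).
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)

# To raise it persistently, add e.g. "solr hard nofile 65536" to
# /etc/security/limits.conf, or run `ulimit -n 65536` in the shell
# that starts Solr. From Python, the soft limit can be raised up to
# the hard limit with:
#   resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))
```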
On Mon, 2015-02-02 at 09:59 +0100, Nitin Solanki wrote:
Ran this command ulimit -a and found open files (-n) 1024
That explains your error message.
size of Xmx = 2048MB
Total Indexed data = 2.04GB
That was surprising. Solr allocation should not reach 20GB with that
setup. How did you get