This is indeed an interesting idea, but I think it's a bit too
manual for our use case. I do see it would solve the problem
though, so thank you for sharing it with the community! :)
-Original Message-
From: James Thomas [mailto:jtho...@camstar.com]
Sent: 15.
Hi Alex,
Yes, this makes sense. My Java is a bit rusty, but depending on how much we
end up needing this feature, it's definitely something we will look into
creating, and if successful, we will definitely submit a patch. Thank
you for your time and detailed answer!
Best
Hi,
You should use the CoreAdmin API (or the Solr Admin page) and UNLOAD unneeded
cores. This will unregister them from ZooKeeper (the cluster state will be
updated), so they won't be used for querying any longer. A SolrCloud restart
is not needed in this case.
Regards.
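For reference, a minimal sketch of such an UNLOAD request (the host and core name are illustrative, not from the thread):

```
http://localhost:8983/solr/admin/cores?action=UNLOAD&core=core_to_remove
```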
On 16 July 2013 06:18, Ali, Saqib
I am using solr 4.3 and have 2 collections coll1, coll2.
After searching in coll1 I get field1 values, which is a comma-separated list
of strings like val1, val2, val3, ... valN.
How can I use that list to match field2 in coll2, with those values separated
by an OR clause?
So I want to return all
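A minimal sketch of building such an OR query from the comma-separated field1 value (field and value names are taken from the question; escaping of Solr special characters is left out for brevity):

```python
# Turn the comma-separated field1 value from coll1 into an OR query
# against coll2's field2.
field1_value = "val1, val2, val3"
values = [v.strip() for v in field1_value.split(",") if v.strip()]
query = "field2:(" + " OR ".join(values) + ")"
print(query)  # field2:(val1 OR val2 OR val3)
```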
Hello Manasi,
Have a look at Solr pseudo joins http://wiki.apache.org/solr/Join
Regards
On Jul 16, 2013 9:54 AM, smanad sma...@gmail.com wrote:
I am using solr 4.3 and have 2 collections coll1, coll2.
After searching in coll1 I get field1 values which is a comma separated
list
of strings
Hi,
I have a problem (I wonder if it is possible to solve it at all) with the
following query. There are documents with a field which contains text and
a number in brackets, e.g.:
myfield: this is a text (number)
There might be some other documents with the same text but different number
in
IMHO the number(s) should be extracted and stored in separate columns in
SOLR at indexing time.
--
Oleg
On Tue, Jul 16, 2013 at 10:12 AM, Marcin Rzewucki mrzewu...@gmail.com wrote:
Hi,
I have a problem (wonder if it is possible to solve it at all) with the
following query. There are
Hi Oleg,
It's a multivalued field and it won't be easier to query when I split this
field into text and numbers. I may get wrong results.
Regards.
On 16 July 2013 09:35, Oleg Burlaca oburl...@gmail.com wrote:
IMHO the number(s) should be extracted and stored in separate columns in
SOLR at
Alex, I am a beginner and I find it a really good idea. A new forum
dedicated to understanding the features rather than the missing ones would
allow newcomers to post questions without cluttering the solr-user list,
where people are already expert practitioners and prefer to see more
targeted topics.
Ah, you mean something like this:
record:
Id=10, text = this is a text N1 (X), another text N2 (Y), text N3 (Z)
Id=11, text = this is a text N1 (W), another text N2 (Q), third text (M)
and you need to search for: text N1 and X B ?
How big is the core? the first thing that comes to my mind,
Hi Erick and everybody else!
Thanks for trying to help. Here is the example:
.../terms?terms.regex.flag=case_insensitive&terms.fl=suggest&terms=true&terms.limit=20&terms.sort=index&terms.prefix=1n1187
returns
<int name="1n1187">1</int>
<int name="1n1187a">1</int>
<int name="1n1187r">1</int>
<int name="1n1187ra">1</int>
By multivalued I meant an array of values. For example:
<arr name="myfield">
  <str>text1 (X)</str>
  <str>text2 (Y)</str>
</arr>
I'd like to avoid splitting it as you propose. I have a 2.3M-document
collection with pretty large records (a few hundred fields or more per
record). Duplicating them would impact performance.
As I said, if I change it in context.xml it works... but the question is
how to make it from the command line, without modifying config files.
Thanks
--
View this message in context:
http://lucene.472066.n3.nabble.com/How-to-change-extracted-directory-tp4078024p4078284.html
Sent from the Solr -
Hello list,
Following the answer by Jayendra here:
http://stackoverflow.com/questions/14516279/how-to-add-collections-to-solr-core
Sorry, hit send too fast..
picking up:
From the answer by Jayendra at the link, collections and cores are the same
thing. The same is seconded by the config:
<cores adminPath="/admin/cores" defaultCoreName="collection1"
host="${host:}" hostPort="${jetty.port:8983}"
hostContext="${hostContext:solr}">
First, when switching subjects please start a new thread. It gets
confusing to have multiple topics in one thread; it's called thread hijacking.
Second, I have no clue why your Nutch output is outputting
invalid characters. Sounds like
1) your custom plugin is doing something weird,
or
2) something you could
I used the reload command to apply changes in synonyms.txt for example, but
with the new mechanism https://wiki.apache.org/solr/CoreAdmin#LiveReload
this will not work anymore.
Is there another way to reload config files instead of restarting Solr?
Roman:
Did this ever make it into a JIRA? Somehow I missed it if it did, and this
would be pretty cool.
Erick
On Mon, Jul 15, 2013 at 6:52 PM, Roman Chyla roman.ch...@gmail.com wrote:
On Sun, Jul 14, 2013 at 1:45 PM, Oleg Burlaca oburl...@gmail.com wrote:
Hello Erick,
Join performance is
Garbage in, garbage out <g>
Your indexing analysis chain is breaking up the tokens via the
EdgeNgramTokenizer and _putting those values in the index_.
Then the TermsComponent is looking _only_ at the tokens in
the index and giving you back exactly what you're asking for.
So no, there's no way
You could also use a DocTransformer. But really, unless these
fields are quite long it seems overkill to do anything but ignore
them when returned for docs you don't care about.
Best
Erick
On Mon, Jul 15, 2013 at 7:05 PM, Jack Krupansky j...@basetechnology.com wrote:
SOLR-5005 -
Is that this one: https://issues.apache.org/jira/browse/SOLR-1913 ?
Regards,
Alex.
Personal website: http://www.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all at
once. Lately, it doesn't seem to
Not quite sure what's the problem with the second, but the
first is:
q=:
That just isn't legal, try q=*:*
As for the second, are there any other errors in the solr log?
Sometimes what's returned in the response packet does not
include the true source of the problem.
Best
Erick
On Mon, Jul 15,
Sorry, but you are basically misusing Solr (and multivalued fields), trying
to take a shortcut to avoid a proper data model.
To properly use Solr, you need to put each of these multivalued field values
in a separate Solr document, with a text field and a value field. Then,
you can query:
Thanks Erick, that is what I suspected. We are very happy with the four
suggestions in the example (and all the others), but we would like to know
which of them represents a full part number.
Can you elaborate a little more how that could be achieved?
Best regards,
Alexander
-Ursprüngliche
Thanks, I was actually asking about deleting nodes from the cluster state,
not cores, unless you can unload cores specific to an already offline node
from zookeeper.
On Tue, Jul 16, 2013 at 1:55 AM, Marcin Rzewucki mrzewu...@gmail.com wrote:
Hi,
You should use CoreAdmin API (or Solr Admin
I'm guessing the answer is yes, but here's the background.
We index 2 separate fields, headline and body text for a document, and then
we want to identify the top of the story, which is the headline + N words
of the body (we want to weight that in scoring).
So to do that:
copyField src=headline
If you only have one collection and no Solr cloud, then don't use solr.xml
at all. It will automatically assume 'collection1' as a name.
If you do want to have some control (shards, etc), do not include the
optional parameters you do not need. See example here:
Hi ,
We have been using solr 3.6.1 .Recently downloaded the solr 4.3.1 version
and installed the same as multicore setup as follows
Folder Structure
solr.war
solr
conf
core0
core1
solr.xml
Created the context fragment xml file in
Basically, the evaluation of function queries in the fl parameter occurs
when the response writer is composing the document results. That's AFTER all
of the search components are done.
SolrReturnFields.getTransformer() gets the DocTransformer, which is really a
DocTransformers, and then a
Hi
I'm using solr version 4.3.1. I have a core with only one shard and
three replicas, say server1, server2 and server3.
Suppose server1 is currently the leader
if I send an update to the leader everything works fine
wget -O - --header='Content-type: text/xml'
--post-data='<add><doc><field
Yes, each input value is analyzed separately. Solr passes each input value
to Lucene and then Lucene analyzes each.
You could use LimitTokenPositionFilterFactory which uses the absolute token
position - each successive analyzed value would have an incremented
position, plus the
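A schema sketch of what such an analysis chain could look like (the field type name, tokenizer choice, and the limit of 10 are illustrative, not from the thread):

```xml
<fieldType name="text_limited" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- drop tokens whose position exceeds the configured maximum -->
    <filter class="solr.LimitTokenPositionFilterFactory" maxTokenPosition="10"/>
  </analyzer>
</fieldType>
```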
Hi
I need to create a solr cluster that contains geospatial information and
provides the ability to perform a few hundreds queries per second, each
query should retrieve around 100k results.
The data is around 100k documents, around 300gb total.
I started with 2 shard cluster (replicationFactor
Hi All,
Can you change the configuration of a spellchecker
using solr.DirectSolrSpellChecker after you've built an index? I know that
this spellchecker doesn't build an index off to the side like
the IndexBasedSpellChecker, so I'm wondering what's happening internally to
create a spellchecking
Have you looked at cache utilization?
Have you checked the IO and CPU load to see what the bottlenecks are?
Are you sure things like your heap and servlet container threads are tuned?
After you look at those issues, I'd probably think about adding http
caching and more replicas.
Michael Della
I think this is SOLR-4923 https://issues.apache.org/jira/browse/SOLR-4923,
should be fixed in 4.4 (when it comes out) or grab the branch_4x branch
from svn.
On 16 July 2013 14:12, giovanni.bricc...@banzai.it
giovanni.bricc...@banzai.it wrote:
Hi
I'm using solr version 4.3.1. I have a core
Thanks Jack.
There seems to be a never-ending set of FilterFactories; I keep hearing
about new ones all the time :)
Ok, I get it, so our existing code is the first N tokens of each value, and
using LimitTokenPositionFilterFactory with the same number would give us
the first N of the combined
Self-correction, we'd need to set LimitTokenPositionFilterFactory to PI
+ N to give the results above because of the increment gap between values.
On 16 July 2013 17:16, Daniel Collins danwcoll...@gmail.com wrote:
Thanks Jack.
There seem to be a never ending set of FilterFactories, I keep
You only have a 20GB collection, but is that per machine or the total
collection, i.e. 10GB per machine? What memory do you have available on
those 2 machines; is it enough to get the collection into the disk cache?
What OS is it (Linux/Windows, etc.)?
What heap size does your JVM have?
Is it a static
Are you requesting all 100K results in one request? If so, that is pretty fast.
If you are doing that, don't do that. Page the results.
wunder
On Jul 16, 2013, at 9:30 AM, Daniel Collins wrote:
You only have a 20Gb collection but is that per machine or total
collection, so 10Gb per machine?
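A sketch of what paging the requests could look like, using Solr's start/rows parameters (the page size, total, and q value are illustrative):

```python
# Instead of one request for all 100k hits, walk the result set in pages.
def page_params(total_hits, rows=1000):
    for start in range(0, total_hits, rows):
        yield {"q": "*:*", "start": start, "rows": rows}

pages = list(page_params(3500, rows=1000))
print(len(pages))          # 4
print(pages[-1]["start"])  # 3000
```

Note that very large start values get progressively more expensive on the server side, so keep pages reasonably sized.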
: I used the reload command to apply changes in synonyms.txt for example, but
: with the new mechanisme https://wiki.apache.org/solr/CoreAdmin#LiveReload
: this will not work anymore.
the Live reload doesn't affect schema.xml settings and analyzers (like
changing stopwords or synonyms) ...
Actually, I appear to be wrong on the position limit filter - it appears to
be relative to the string being analyzed and not the full sequence of values
analyzed for the field.
Given this field and type:
<fieldType name="text_limit_position4" class="solr.TextField"
positionIncrementGap="10">
OK, so that's why I cannot see the FunctionQuery fields in my
SearchComponent class.
So then the question would be: how can I apply my custom processing/logic to
these FunctionQueries? What's the extension point in Solr for such scenarios?
Basically I want to call termfreq() for each document and then
Does anyone know if issue SOLR-1397 (It should be possible to highlight
external text) is actively being worked on, by chance? Looks like the last
update was May 2012.
https://issues.apache.org/jira/browse/SOLR-1397
I'm trying to find a way to best highlight search results even though those
On 7/16/2013 2:02 AM, wolbi wrote:
As I said, if I change it in context.xml it works... but the question is...
how to make it from commandline, without modyfing config files.
Thanks
Take it out of the config file.
Thanks,
Shawn
This problem looks to me like it's because of Solr logging;
see below for a detailed description (taken from one of the mail threads)
-
Solr 4.3.0 and later does not have ANY slf4j jarfiles in the .war file,
so you need to put them in
Erick,
I wasn't sure this issue was important, so I wanted to first solicit some
feedback. You and Otis expressed interest, and I could create the JIRA;
however, as Alexandre points out, SOLR-1913 seems similar (actually,
closer to the Otis request to have the elasticsearch named filter) but
Are you using synonyms during indexing or during query only? If during
indexing, the reloading by itself will not change what was stored - you
need to fully reindex as well.
Regards,
Alex.
Personal website: http://www.outerthoughts.com/
LinkedIn:
Hi,
We are upgrading Solr from 3.6 to 4.3, but we have a large amount of indexed
data and could not afford to reindex it all at once.
We wish Solr 4.3 could do the following:
1/ still be able to search on Solr 3.6 indexed data
2/ whenever indexing a new document, convert it to the 4.3 format (may not
My bad. I did some more testing as well and could not replicate the behavior.
Reloading synonyms works fine with a core reload.
Chris Hostetter-3 wrote
: I used the reload command to apply changes in synonyms.txt for example,
but
: with the new mechanisme
Well, I think this is slightly too categorical - a range query on a
substring can be thought of as a simple range query. So, for example the
following query:
lucene 1*
becomes behind the scenes: lucene (10|11|12|13|14|1abcd)
the issue there is that it is a string range, but it is a range query
Thanks Erick,
I've configured both to use 8080 (for Wicket this is standard :-)).
Do I have to assign a different port to Solr if I use both webapps in
the same container?
Btw. the context path for my Wicket app is /*.
Could that be a problem too?
Per
Am 15.07.2013 17:12, schrieb Erick
Thanks Sandeep, that fixed it.
Regards,
Sujatha
On Tue, Jul 16, 2013 at 10:41 PM, Sandeep Gupta gupta...@gmail.com wrote:
This problem looks to me like it's because of Solr logging;
see below for a detailed description (taken from one of the mail threads)
Thanks Alexandre,
Well, the initial question was whether it is possible to altogether avoid
dealing with collections (extra layer, longer URL). But it seems this is a
new internal feature of the Solr 4 generation. In Solr 3 it was just a core,
which could be avoided if no solr.xml was found.
With
Looks like the JoinQParserPlugin is throwing an NPE.
Query: localhost:8983/solr/location/select?q=*:*fq={!join from=key
to=merchantId fromIndex=merchant}
84343345 [qtp2012387303-16] ERROR org.apache.solr.core.SolrCore –
java.lang.NullPointerException
at
Search this mailing list and you will find a very long discussion about the
terminology and the confusion around it. My contribution to that was the crude
picture trying to explain it: http://bit.ly/1aqohUf . Maybe it will help.
If you don't want the longer URL, do use solr.xml and use @adminPath and
Found this post:
http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201302.mbox/%3CCAB_8Yd82aqq=oY6dBRmVjG7gvBBewmkZGF9V=fpne4xgkbu...@mail.gmail.com%3E
And based on the answer, I modified my query: localhost:8983/solr/location/
select?fq={!join from=key to=merchantId
On 7/16/2013 12:41 PM, Alexandre Rafalovitch wrote:
Search this mailing list and you will find a very long discussion about the
terminology and confusion around it. My contribution to that was the crude
picture trying to explain it: http://bit.ly/1aqohUf . Maybe it will help.
If you don't want
Hi,
Is there any documentation of how to configure SolrCloud Zookeeper using
SASL (on JBOSS 5). When I start SolrCloud on Jboss 5 I see WARN:
/2013-07-16 21:38:17,425 INFO
[org.apache.solr.common.cloud.ConnectionManager:157] (main) Waiting for
client to connect to ZooKeeper
2013-07-16
Hello Everyone,
We are using solrcloud with Tomcat in our production environment.
Here is our configuration.
solr-4.0.0
JVM 1.6.0_25
The JVM keeps crashing every day with the following error. I think it is
happening while we try to index the data with the SolrJ APIs.
INFO: [aq-core] webapp=/solr
Thanks Alexandre, I think I have followed that discussion, there was
another one AFAIR on the dev list.
On your diagram, am I guessing it correctly, that shard1 and shard2 inside
a collection would at least share the same schema?
On Tue, Jul 16, 2013 at 9:41 PM, Alexandre Rafalovitch
I figured it out for anyone finding this thread. I had to add the following
to my solrconfig.xml
<luceneMatchVersion>LUCENE_31</luceneMatchVersion>
http://www.searchspring.net/
James Bathgate, Sr. Developer, 888.643.9043 ext. 610
http://www.linkedin.com/in/bathgate
On Thu, Jul 11, 2013 at 2:47 PM,
Hi Shawn,
Thanks for your input.
Having spent some time today figuring out the path to upgrade, I concluded
that we have been using what is (and was in solr 3 and possibly earlier)
called a core. A group of two cores (with different schemas) we (probably
mistakenly) referred to as a shard. That
I don't know about jvm crashes, but it is known that the Java 6 jvm had
various problems supporting Solr, including the 20-30 series. A lot of
people use the final jvm release (I think 6_30).
On 07/16/2013 12:25 PM, neoman wrote:
Hello Everyone,
We are using solrcloud with Tomcat in our
I'm trying to find a way to best highlight search results even though
those
results are not stored in my index. Has anyone been successful in
reusing
the SOLR highlighting logic on non-stored data?
I was able to do this by slightly modifying the FastVectorHighlighter so
that it returned
Hi guys,
First of all, thanks for your response.
Jack: Data structure was created some time ago and this is a new
requirement in my project. I'm trying to find a solution. I wouldn't like
to split multivalued field into N similar records varying in this
particular field only. That could impact
Unloading a core is the known way to unregister a Solr node in ZooKeeper
(so it is not used for further querying). It works for me. If you didn't do it
like this, unused nodes may remain in the cluster state and Solr may try to
use them without success. I'd suggest starting some machine with the old
Hi Everyone,
I'm using Solr (version 4.3) for the first time and through much research I
got into writing a custom search handler using edismax to do relevancy
searches. Of course, the client I'm preparing the search for also has
synonyms (both bidirectional and explicit). After much research, I
I've solved it by removing <filter class="solr.DoubleMetaphoneFilterFactory"
inject="true"/>.
But now I have a problem. If I search for Rocket Bananaa (with a double
'a') the result doesn't appear first.
Any ideas how to fix it?
Rocket Banana (Single) should be first because it's the closest to Rocket
Banana.
How can I get an ideal ranking that returns the closest words in the first
positions?
I want to script the creation of N solr cloud instances (on ec2).
But it's not clear to me where I would specify the numShards setting.
From the documentation, I see you can specify it on the first node you start
up, OR alternatively use the Collections API to create a new collection - but
in that case
On Tue, Jul 16, 2013 at 5:08 PM, Marcin Rzewucki mrzewu...@gmail.com wrote:
Hi guys,
First of all, thanks for your response.
Jack: Data structure was created some time ago and this is a new
requirement in my project. I'm trying to find a solution. I wouldn't like
to split multivalued field
In case you were unaware, generalized multi-word synonym expansion is an
unsolved problem in Lucene/Solr. Sure, some of the tools are there and you
can sometimes make it work for some situations, but not for the general
case. Some work has been in progress, but no near-term solution is at hand.
Use fuzzy search instead of phonetic search. Phonetic search is a poor match to
most queries.
At Netflix, we dropped phonetic search and started using fuzzy. There was a
clear improvement in the A/B test.
wunder
On Jul 16, 2013, at 2:25 PM, padcoe wrote:
Rocket Banana (Single) should be
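For the record, a fuzzy query in Lucene/Solr 4.x syntax could look like this (the field name is illustrative; the ~2 suffix is the maximum edit distance):

```
q=name:Bananaa~2
```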
What does the solr.xml look like on the nodes?
On Tue, Jul 16, 2013 at 2:36 PM, Robert Stewart robert_stew...@epam.com wrote:
I want to script the creation of N solr cloud instances (on ec2).
But its not clear to me where I would specify numShards setting.
From documentation, I see you can
Hi Marcin,
Maybe you can use https://issues.apache.org/jira/browse/SOLR-1604 .
ComplexPhraseQueryParser supports ranges inside phrases.
From: Marcin Rzewucki mrzewu...@gmail.com
To: solr-user@lucene.apache.org
Sent: Wednesday, July 17, 2013 12:08 AM
Subject:
On 7/16/2013 3:36 PM, Robert Stewart wrote:
I want to script the creation of N solr cloud instances (on ec2).
But its not clear to me where I would specify numShards setting.
From documentation, I see you can specify on the first node you start up, OR
alternatively, use the collections API to
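The Collections API call being referred to can be sketched as follows (host, collection name, and shard/replica counts are illustrative):

```
http://localhost:8983/solr/admin/collections?action=CREATE&name=mycollection&numShards=2&replicationFactor=2
```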
Yeah, I was thinking about that.
But... will it properly order 10 as being greater than 9? Usually, we
used trie or sorted field types to assure numeric order, but a text field
doesn't have that feature.
Although I did think that maybe you could have a token filter that mapped
numeric
Hi Dmarin,
Did you consider using http://wiki.apache.org/solr/QueryElevationComponent ?
From: Jack Krupansky j...@basetechnology.com
To: solr-user@lucene.apache.org
Sent: Wednesday, July 17, 2013 12:53 AM
Subject: Re: Searching w/explicit Multi-Word Synonym
Yes, you need to use a different port for Solr.
As for the contextpath, I have no idea.
Best
Erick
On Tue, Jul 16, 2013 at 2:02 PM, Per Newgro per.new...@gmx.ch wrote:
Thanks Eric,
i've configured both to use 8080 (For wicket this is standard :-)).
Do i have to assign a different port to
You can only join on indexed fields, and your Location:merchantId field is not
indexed.
Best
Erick
On Tue, Jul 16, 2013 at 2:48 PM, Utkarsh Sengar utkarsh2...@gmail.com wrote:
Found this post:
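For the join to work, the `to` field has to be declared indexed in the schema of the "to" core, e.g. a sketch (the type name and stored setting are illustrative):

```xml
<field name="merchantId" type="string" indexed="true" stored="true"/>
```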
All of the core loading stuff is on the server side, so CloudSolrServer
isn't really germane (I don't think anyway).
This is in a bit of flux, so try having one core that's loaded on startup
even if it's just a dummy core. There's currently ongoing work to
play nicer with no cores being defined
Maybe it was lost, I tend to babble on... But use a copyField directive
that doesn't have the EdgeNGramTokenizerFactory in the chain
and get your suggestions from _that_ field rather than the one you
do use currently. You can still search etc. on the one you now
have, just get your suggestions
Hello,
1. It depends on your query types and data (complexity, feature set,
paging) - geospatial could be something with calculation inside Solr?
2. It depends massively on the document size and field selection (loading a
hundred 100MB documents can take some time)
3. It depends especially on your