Thanks Sujit and all for your views about semantic search in Solr.
But how do I proceed? I mean, how do I start things off to get on track?
On Sat, Mar 8, 2014 at 10:50 PM, Sujit Pal sujit@comcast.net wrote:
Thanks for sharing this link Sohan, it's an interesting approach.
On Sun, 2014-03-09 at 19:55 +0100, abhishek jain wrote:
I am confused: should I keep two separate indexes, or one index with two versions of the column, i.e. col1_stemmed and col2_unstemmed?
One index with both stemmed and unstemmed fields will be markedly smaller than two indexes
(one with stemmed, one with
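For the single-index approach, a schema sketch along these lines could work; the field and type names here are hypothetical, not from the original thread:

```xml
<!-- Hypothetical schema.xml sketch: one stored source field copied
     into a stemmed and an unstemmed indexed variant -->
<field name="body"           type="text_general"   indexed="false" stored="true"/>
<field name="body_stemmed"   type="text_stemmed"   indexed="true"  stored="false"/>
<field name="body_unstemmed" type="text_unstemmed" indexed="true"  stored="false"/>

<copyField source="body" dest="body_stemmed"/>
<copyField source="body" dest="body_unstemmed"/>
```

Queries can then target either variant (or both, with different boosts) while the document text is stored only once.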
Does this maxClauseCount apply to each field individually, or to all fields put together?
Is it the date fields?
When I execute a query I get this error:
<lst name="responseHeader">
  <int name="status">500</int>
  <int name="QTime">93</int>
  <lst name="params">
    <str name="indent">true</str>
    <str name="q">Ein PDFchen als
Hi,
When our server crashes the memory fills up fast, so I think it might
be a specific query that causes our servers to crash. I think the query
won't be logged because it doesn't finish. Is there anything we can do
to see the currently running queries in the Solr server (so we can see
Hi all,
we are using SolrCloud with this configuration:
* Solr 4.4.0
* Zookeeper 3.4.5
* one server with ZooKeeper + 4 Solr nodes
* one server with 4 Solr nodes
* only one core
* Solr instances deployed on Tomcat with mod_cluster
As of now the index is at 136 GB.
I want to understand: can we do multiple writes on Solr? I don't have any
partitioning strategy as of now.
On the Amazon instance for Solr the disk read/write is at 5% or so. I am not
able to understand: even though I am processing almost 300 records per minute,
how come
Hi all.
I need your help! I have read every post about Spatial in Solr because I
need to check if a point (latitude,longitude) is inside a Polygon.
/**/
/* 1. library */
/**/
(1) I use jts-1.13.jar and spatial4j-0.4.1.jar
(I think they are the latest version)
Excellent, thank you.
Lee
--
View this message in context:
http://lucene.472066.n3.nabble.com/Solr-Production-Installation-tp4122091p4122533.html
Sent from the Solr - User mailing list archive at Nabble.com.
We just upgraded our dev environment from Solr 4.6 to 4.7 and our search
posts are now returning a "Search requests cannot accept content streams"
error. We did not install over top of our 4.6 install; we installed into a
new folder.
org.apache.solr.common.SolrException: Search requests cannot
Hi,
As a solution, I have tried a combination of PatternTokenizerFactory and
PatternReplaceFilterFactory. In both the query and index analyzers I have written:
<tokenizer class="solr.PatternTokenizerFactory" pattern="\s+" />
<filter class="solr.PatternReplaceFilterFactory" pattern="([^-\w]+)"
        replacement="punct"
On 3/10/2014 6:20 AM, abhishek jain wrote:
<tokenizer class="solr.PatternTokenizerFactory" pattern="\s+" />
<filter class="solr.PatternReplaceFilterFactory" pattern="([^-\w]+)"
        replacement="punct" replace="all"/>
snip
Is there a way I can tokenize after application of the filter? Please suggest; I
know I am
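The ordering question above can be illustrated outside Solr. This is a rough Python stand-in for the whitespace tokenizer followed by the punctuation-stripping filter (the regexes mirror the patterns quoted above; the function names are mine, not Solr API):

```python
import re

def whitespace_tokenize(text):
    # Stand-in for solr.PatternTokenizerFactory with pattern="\s+":
    # split on runs of whitespace, dropping empty tokens
    return [t for t in re.split(r"\s+", text) if t]

def strip_punct(tokens):
    # Stand-in for solr.PatternReplaceFilterFactory with pattern="([^-\w]+)":
    # remove anything that is not a word character or a hyphen
    return [re.sub(r"[^-\w]+", "", t) for t in tokens]

# The tokenizer always runs first; the filter then cleans each token
tokens = strip_punct(whitespace_tokenize("state-of-the-art, (really)!"))
print(tokens)  # ['state-of-the-art', 'really']
```

This shows why the filter cannot change the token boundaries: by the time it runs, the tokenizer has already split the stream.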
I want a list of users who are online and fulfill the specified criteria.
Current implementation:
I am sending POST parameters of online IDs (usually 20k) along with the search
criteria.
How I want to optimize it:
I must change the internal code of Solr so that these 20k profiles are
fetched from Solr
Hi Priti;
100 qps is not much, but 7 GB is too low and may be a problem for you. I
have tens of SolrCloud nodes and I send data to them via Map/Reduce from
tens of servers. However, indexing speed has not been a problem for me yet.
Problems occur because of network communication, RAM or
Hi,
Brief description of my application:
We have a Java program which reads a flat file and adds documents to Solr
using CloudSolrServer.
We index every 1000 documents (bulk indexing).
The autocommit setting of my application is:
<autoCommit>
  <maxDocs>10</maxDocs>
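The batch-every-1000-documents pattern can be sketched like this; the `send_batch` callback stands in for the CloudSolrServer add/commit calls, which are not shown:

```python
def index_in_batches(docs, send_batch, batch_size=1000):
    """Collect documents and flush each time batch_size is reached."""
    batch = []
    for doc in docs:
        batch.append(doc)
        if len(batch) >= batch_size:
            send_batch(batch)
            batch = []
    if batch:  # flush the final partial batch
        send_batch(batch)

# Example: a stream of 2500 docs produces batches of 1000, 1000, 500
sizes = []
index_in_batches(range(2500), lambda b: sizes.append(len(b)), batch_size=1000)
print(sizes)  # [1000, 1000, 500]
```

Note the final partial batch: without the trailing flush, the last documents would never be sent.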
Hi Metin;
I think the timeout value you are talking about is this one:
http://zookeeper.apache.org/doc/r3.1.2/zookeeperStarted.html However, it is
not recommended to change the timeout value of Zookeeper unless you have a
specific reason. On the other hand, how many Zookeepers do you have at your
Thanks Furkan,
This gives some really good understanding. We have an Amazon instance and right
now it is running on m1.large.
In Amazon we are not finding support to increase ONLY the RAM! That is our main
concern, and we are actively looking at which instance can help us support
this index size.
Do you
Hi all,
The following throws "The requested resource is not available":
curl
http://localhost:8080/solr/#/dev/update/extract?stream.file=/home/priti/$fileliteral.id=document$icommit=true
I don't understand what literal.id is. Is it mandatory? [Please share
reading links if known]
</head><body><h1>HTTP
Hi,
How many refreshes do you need? Can you live with a 3-5 minute refresh rate?
If you can afford to query MySQL for every single query, consider using a post
filter:
http://searchhub.org/2012/02/22/custom-security-filtering-in-solr/
Ahmet
On Monday, March 10, 2014 2:56 PM, lavesh
Hi;
Did you read here:
http://searchhub.org/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
Thanks;
Furkan KAMACI
2014-03-10 15:14 GMT+02:00 RadhaJayalakshmi rlakshminaraya...@inautix.co.in:
Hi,
Brief Description of my application:
We have a java program
Any news regarding this?
I'm investigating Solr offline clustering as well (full-index
clustering).
Cheers
2012-09-17 20:16 GMT+01:00 Denis Kuzmenok forward...@ukr.net:
Sorry for the late response. To be strict, here is what I want:
* I get documents all the time. Let's assume those are
Hi,
I have a performance problem and a scoring problem with phrase queries:
1. Performance - phrase queries involving frequent terms are very slow
due to reading the large positions posting lists.
2. Scoring - I want to control the boost of phrase and entity (in
gazetteers) matches
Indexing
Hi guys,
I'm looking around to find out whether it's possible to have a full-index
/ offline cluster.
My goal is to do full-index clustering and, for each document, to have a
cluster field with the id/label of the cluster at indexing time.
Does anyone know more details regarding this kind of integration
Hi Alessandro,
Generally, Apache Mahout (http://mahout.apache.org) is recommended for offline
clustering.
Ahmet
On Monday, March 10, 2014 4:11 PM, Alessandro Benedetti
benedetti.ale...@gmail.com wrote:
Hi guys,
I'm looking around to find out if it's possible to have a full-index
/Offline
Hi
Got it working!
Many thanks for your response.
On Sat, Mar 8, 2014 at 7:40 PM, Furkan KAMACI furkankam...@gmail.com wrote:
Hi;
Could you check here:
http://lucene.472066.n3.nabble.com/Error-when-creating-collection-in-Solr-4-6-td4103536.html
Thanks;
Furkan KAMACI
2014-03-07 9:44
literal.id should contain a unique identifier for each document (assuming
that the unique identifier field in your Solr schema is called id); see
http://wiki.apache.org/solr/ExtractingRequestHandler .
I'm guessing that the URL for the ExtractingRequestHandler is incorrect, or
maybe you haven't
Thank you, Ahmet, I already know Mahout.
What I was curious about is whether an integration for offline clustering
already exists in Solr ...
Reading the wiki we can find this phrase: "While Solr contains an
extension for full-index clustering (*off-line* clustering) this
section will focus on
Hello Furkan,
We are planning to migrate to 3 nodes in an ensemble, but for now we have only
one active ZooKeeper instance in production.
Actually, I thought there was a param somewhere in the Solr configuration. I may
be wrong, but I thought the problem was due to the fact that Solr asks or
The # character introduces the fragment portion of a URL, so
/dev/update/extract is not a part of the path of the URL. In this case
the URL path is /solr/ and the server is simply complaining that there
is no code registered to process that path.
Normally, the collection name (core name)
You need to either quote your query (after the colon, and another at the
very end), or escape any special characters, or use a different query
parser like “field”. I prefer to use the field query parser:
{!field f=loc}Intersects(POLYGON(...
~ David
On 3/6/14, 10:52 AM, leevduhl
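One way to sidestep escaping problems with the `{!field}` local-params syntax is to URL-encode the whole fq value; a quick sketch (host, port, and field name are placeholders):

```python
from urllib.parse import urlencode

# Hypothetical polygon; {!field f=loc} hands the raw shape string to the
# field query parser, so only URL encoding is needed, not query escaping
fq = "{!field f=loc}Intersects(POLYGON((-10 30, -40 40, -10 -20, 40 20, 0 0, -10 30)))"
params = urlencode({"q": "*:*", "fq": fq})
url = "http://localhost:8983/solr/select?" + params
print(url)
```

`urlencode` takes care of the `{`, `!`, spaces, and parentheses that would otherwise need manual percent-encoding in the request URL.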
On 3/10/14, 6:45 AM, Javi javiersangra...@mitula.com wrote:
Hi all.
I need your help! I have read every post about Spatial in Solr because I
need to check if a point (latitude,longitude) is inside a Polygon.
/**/
/* 1. library */
/**/
(1) I use jts-1.13.jar and
The clause limit covers all clauses (terms) in one Lucene BooleanQuery - one
level of a Solr query, where a parenthesized sub-query is a separate level
and counts as a single clause in the parent query.
In this case, it appears that the wildcard is being expanded/rewritten to a
long list of
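If the expanded wildcard legitimately needs that many clauses, the limit can be raised in solrconfig.xml; 1024 is the usual default, and 4096 below is just an illustrative value:

```xml
<!-- solrconfig.xml: raise the BooleanQuery clause limit (default 1024) -->
<maxBooleanClauses>4096</maxBooleanClauses>
```

Raising the limit trades memory and query time for the larger expansion, so restructuring the query is often the better fix.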
Hi, I have a few text fields indexed, and when searching I need to know which
field matched. For example I have fields:
{code}
full_name, site_source, tweets, rss_entries, etc
{code}
When searching I need to show results with scores per field, so a
user can see exactly what content matched
On 3/10/14, 12:12 PM, Smiley, David W. dsmi...@mitre.org wrote:
c) I tried no WKT format by adding a comma and using longitude,latitude
<doc>
  <arr name="LOCATION">
    <str>40.442179,-3.69278</str>
  </arr>
</doc>
That is *wrong*. Remove the comma and it will then be okay. But again,
see my
David Smiley (@MITRE.org) wrote
On 3/10/14, 6:45 AM, Javi <javiersangrador@> wrote:
/**/
/* 1. library */
/**/
(1) I use jts-1.13.jar and spatial4j-0.4.1.jar
(I think they are the latest version)
You should only need to add JTS; spatial4j is included in
Take a look at the explain section of the results when you set the
debugQuery=true parameter.
Also set the debug.explain.structured=true parameter to get a structured
representation of the explain section.
-- Jack Krupansky
-Original Message-
From: heaven
Sent: Monday, March 10,
Changes from the previous release are primarily off-heap FieldCache
support for strings as well as all numerics (the previous release
only had integer support).
Benchmarks for string fields here:
http://heliosearch.org/hs-solr-off-heap-fieldcache-performance
Try it out here:
On 3/10/14, 12:56 PM, javinsnc javiersangra...@mitula.com wrote:
/*/
/* Document contents */
/*/
I have tried 3 different contents for my documents (lat-lon refers to
Madrid, Spain):
Um…. Just to be absolutely sure, are you adding the data in
David Smiley (@MITRE.org) wrote
On 3/10/14, 12:56 PM, javinsnc <javiersangrador@> wrote:
This is indeed the source of the problem.
Why do you index with Lucene’s API and not Solr’s? Solr not only has a
web-service API but it also has the SolrJ API that can embed Solr —
You're going to have to use the Lucene-spatial module directly then. There's
SpatialExample.java to get you started.
javinsnc wrote
David Smiley (@MITRE.org) wrote
On 3/10/14, 12:56 PM, javinsnc <javiersangrador@> wrote:
This is indeed the source of the problem.
Why do you
Could you please tell me where I can find this .java?
What do you mean by the Lucene-spatial module?
Thanks for your time, David!
Hi;
If you have any other problems you can ask them too.
Thanks;
Furkan KAMACI
2014-03-10 16:17 GMT+02:00 Vineet Mishra clearmido...@gmail.com:
Hi
Got it working!
Many thanks for your response.
On Sat, Mar 8, 2014 at 7:40 PM, Furkan KAMACI furkankam...@gmail.com
wrote:
Hi;
Lucene has multiple modules, one of which is spatial. You'll see it in the
source tree checkout underneath the lucene directory.
Javadocs: http://lucene.apache.org/core/4_7_0/spatial/index.html
SpatialExample.java:
OK David, I'll give it a shot.
Thanks again!
Hi,
That's weird. As far as I know there is no such thing. There is classification
stuff, but I haven't heard of clustering.
http://soleami.com/blog/comparing-document-classification-functions-of-lucene-and-mahout.html
May be others (Dawid Weiss) can clarify?
Ahmet
On Monday, March 10, 2014
That's weird. As far as I know there is no such thing. There is
classification stuff, but I haven't heard of clustering.
http://soleami.com/blog/comparing-document-classification-functions-of-lucene-and-mahout.html
I think the wording on the wiki page needs some clarification -- Solr
contains
Hi Ahmet, Ale,
right, there's a classification module for Lucene (and therefore usable in
Solr as well), but no clustering support there.
Regards,
Tommaso
2014-03-10 19:15 GMT+01:00 Ahmet Arslan iori...@yahoo.com:
Hi,
Thats weird. As far as I know there is no such thing. There is
Hi Staszek, Tommaso,
Thanks for the clarification.
Ahmet
On Monday, March 10, 2014 8:23 PM, Tommaso Teofili tommaso.teof...@gmail.com
wrote:
Hi Ahmet, Ale,
right, there's a classification module for Lucene (and therefore usable in
Solr as well), but no clustering support there.
Regards,
Maybe I spoke too soon.
The second and third filter parameters *fq={!cache=false cost=50}ClientID:4* and
*fq={!cache=false cost=150}StartDate:[NOW/DAY TO NOW/DAY+1YEAR]* above are
not getting executed unless I make them the first parameter. And when it's
the first filter parameter the QTime goes
Salman,
It looks like what you describe has been implemented at Twitter.
Presentation from the recent Lucene / Solr Revolution conference in Dublin:
http://www.youtube.com/watch?v=AguWva8P_DI
On Sat, Mar 8, 2014 at 4:16 PM, Salman Akram
salman.ak...@northbaysolutions.net wrote:
The issue
I am seeing an error when doing a spatial search: a particular point
is showing up within a polygon, but by all methods I've tried, that point is
not within the polygon.
First the point is: 41.2299,29.1345 (lat/lon)
The polygon is:
31.2719,32.283
31.2179,32.3681
31.1333,32.3407
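One way to cross-check a case like this outside Solr is a stand-alone ray-casting test. This is a generic planar sketch, not Solr's actual algorithm (which also deals with grid approximation and geodesics), so small discrepancies near edges are expected:

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: a point is inside if a horizontal ray from it
    crosses the polygon's edges an odd number of times."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the ray's y-coordinate?
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

square = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
print(point_in_polygon((1.0, 1.0), square))  # True
print(point_in_polygon((3.0, 1.0), square))  # False
```

Feeding the point and polygon from the message above into such a check (minding the lat/lon vs. lon/lat axis order) is a quick way to see whether the disagreement is in the data or in the index.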
Minor edit to the KML to adjust color of polygon
On Mon, Mar 10, 2014 at 4:21 PM, Steven Bower smb-apa...@alcyon.net wrote:
I am seeing an error when doing a spatial search: a particular point
is showing up within a polygon, but by all methods I've tried, that point is
not within the
..Spawning this as a separate thread..
So I have a filter query with multiple fq parameters. However, I have
noticed that only the first fq is used for filtering. For instance, a
lookup with
...fq=ClientID:2
fq=HotelID:234-PPP
fq={!cache=false}StartDate:[NOW/DAY TO *]
In the above query,
Weirdly that same point shows up in the polygon below as well, which in the
area around the point doesn't intersect with the polygon in my first msg...
29.0454,41.2198
29.2349,41.1826
31.1107,40.9956
38.437,40.7991
41.1616,40.8988
What are some example values of the HotelID and StartDate fields that are
not getting filtered out?
Multiple fq queries will be ANDed.
-- Jack Krupansky
-Original Message-
From: Vijay Kokatnur
Sent: Monday, March 10, 2014 4:51 PM
To: solr-user
Subject: Multiple fq parameters are not
Solr has extensive filtering tests.
The first step would be to double check that you see what you think
you are seeing, and then try and create an example to reproduce it.
For example, this works fine with the example data, and is of the
same form as your query:
http://localhost:8983/solr/query
On 3/10/2014 6:14 AM, leevduhl wrote:
We just upgraded our dev environment from Solr 4.6 to 4.7 and our search
posts are now returning a Search requests cannot accept content streams
error. We did not install over top of our 4.6 install, we installed into a
new folder.
Hello!
Luke 4.7.0 has been released. Download it here:
https://github.com/DmitryKey/luke/releases/tag/4.7.0
Release based on pull request of Petri Kivikangas (
https://github.com/DmitryKey/luke/pull/2) Kiitos, Petri!
Tested against the solr-4.7.0 index.
1. Upgraded maven plugins.
2. Added
I get a different error (but related to the same issue I guess) with
the following simple query:
/opt/code/heliosearch/solr$ curl -XPOST "http://localhost:8983/solr/select?q=*:*"
<?xml version="1.0" encoding="UTF-8"?>
<response>
  <lst name="error">
    <str name="msg">Must specify a Content-Type header
with POST
Thanks Erick. The links you provided are invaluable.
Here are our commit settings. Since we have NRT search, softCommit is set
to 1000s, which explains why the cache is constantly invalidated.
<autoCommit>
  <maxTime>60</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>
Hi Steven,
Set distErrPct to 0 in order to get non-point shapes to always be as accurate
as maxDistErr. Point shapes are always that accurate. As long as you only
index points, not other shapes (you don’t index polygons, etc.) then distErrPct
of 0 should be fine. In fact, perhaps a future
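In schema.xml that advice corresponds to something like the following; the field type name and the maxDistErr value here are illustrative, not taken from the thread:

```xml
<!-- schema.xml sketch: point-only spatial field with distErrPct="0",
     JTS enabled for polygon queries -->
<fieldType name="location_rpt"
           class="solr.SpatialRecursivePrefixTreeFieldType"
           spatialContextFactory="com.spatial4j.core.context.jts.JtsSpatialContextFactory"
           distErrPct="0"
           maxDistErr="0.000009"
           units="degrees"/>
```

With only points in the index, distErrPct="0" affects query-shape accuracy, which is why it need not force a reindex of the points themselves.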
Pardon my typo. I meant 1000ms in my last mail.
Thanks,
-Vijay
On Mon, Mar 10, 2014 at 4:22 PM, Vijay Kokatnur kokatnur.vi...@gmail.com wrote:
Thanks Erick. The links you provided are invaluable.
Here are our commit settings. Since we have NRT search, softCommit is set
to 1000s which
Hi Sohan,
You would be the best person to answer your question of how to proceed :-).
To rewrite your original query term "musical events in New York" to
"musical nights at ABC place" OR "concerts events" OR "classical music
event", you would have to build into your knowledge base that ABC place is
a
You have to separate out a couple of things.
First, data gets written to segments _without_
the segment getting closed and _before_ you
commit. What happens is that when
ramBufferSizeMB in solrconfig.xml is exceeded,
its contents are flushed to the currently-opened
segment. The segment is _not_
Having a couple of docs that aren't being
returned that you think should be would
help.
It's tangential, but you might get better
performance out of this when you get over
your initial problem by using something like
fq=StartDate:[NOW/DAY TO NOW/DAY+1DAY]
That'll filter on all docs with
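What NOW/DAY does can be mimicked in Python to see why the UTC offset matters; this is a sketch of the rounding only, not Solr's date-math parser:

```python
from datetime import datetime, timedelta, timezone

def now_day(utc_now):
    """Round a UTC timestamp down to midnight, like Solr's NOW/DAY."""
    return utc_now.replace(hour=0, minute=0, second=0, microsecond=0)

# 22:30 UTC on March 10 may already be "tomorrow" in the user's time zone,
# which is the usual source of NOW/DAY confusion
now = datetime(2014, 3, 10, 22, 30, tzinfo=timezone.utc)
start = now_day(now)                 # NOW/DAY
end = start + timedelta(days=1)      # NOW/DAY+1DAY
print(start.isoformat(), end.isoformat())
```

Since both range endpoints round to the same UTC midnight for every query issued in the same day, the resulting filter string is identical across requests and can be cached.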
I'm trying to use the new InfixSuggester exposed in 4.7 and I'm getting
some errors on startup. They don't seem to cause any real problems (my app
still seems to run), but I get the following:
17:28:54.721 WARN {coreLoadExecutor-4-thread-1} [o.a.s.core.SolrCore] :
[vpr] Solr index
Hello,
is there a fix for the NOW rounding?
Otherwise I have to get the current date and create a range query like
* TO yyyy-MM-ddThh:mm:ssZ
Where do you live? Is it possible you're getting fooled by the fact
that Solr uses UTC?
Solr doesn't distinguish between dates and times, they're all just
unix timestamps.
And, taking into account the time difference between now and UTC in my
time zone it works perfectly for me.
Best,
Erick
On
This looks like a codec issue, but I'm not sure how to address it. I've
found that a different instance of DocsAndPositionsEnum is instantiated
between my code and Solr's TermVectorComponent.
Mine:
org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$EverythingEnum
Solr:
Only points in the index.. Am I correct this won't require a reindex?
On Monday, March 10, 2014, Smiley, David W. dsmi...@mitre.org wrote:
Hi Steven,
Set distErrPct to 0 in order to get non-point shapes to always be as
accurate as maxDistErr. Point shapes are always that accurate. As long
Send the queries.
On Fri, Mar 7, 2014 at 2:32 PM, EXTERNAL Taminidi Ravi (ETI,
Automotive-Service-Solutions) external.ravi.tamin...@us.bosch.com wrote:
Hi All,
I am facing a strange behavior with the Solr Server. All my joins are not
working suddenly after a restart. Individual collections
Really, how can anyone help with so little information?
Please read:
http://wiki.apache.org/solr/UsingMailingLists
Best,
Erick
On Mon, Mar 10, 2014 at 10:03 PM, William Bell billnb...@gmail.com wrote:
Send the queries.
On Fri, Mar 7, 2014 at 2:32 PM, EXTERNAL Taminidi Ravi (ETI,
Correct, Steve. Alternatively you can also put this option in your query
after the end of the last parenthesis, as in this example from the wiki:
fq=geo:IsWithin(POLYGON((-10 30, -40 40, -10 -20, 40 20, 0 0, -10 30)))
distErrPct=0
~ David
Steven Bower wrote
Only points in the index.. Am I
Hello, I think you are confusing two different index
structures, probably because of the names of the options in Solr.
1. Indexing term vectors: this means that given a document, you can
look up a miniature inverted index just for that document. That means
each document has term vectors which
We have a cluster running SolrCloud 4.7 built 2/25. 10 shards with 2
replicas each (20 shards total) at about ~20GB/shard.
We index around 1k-1.5k documents/second into this cluster constantly. To
manage growth we have a scheduled job that runs every 3 hours to prune
documents based on business
The link you provided has no information about customizing