Hi list,
I have written a blog post about the use of Solr for searching at Issuu (http://www.issuu.com).
To give you a sense of the scale, Issuu indexes more than 9 million documents and 200 million pages. In January Issuu had 4.3 billion pageviews and over 125.8 million visits (60.1 million unique).
You define something like postImportDeleteQuery = SELECT Id FROM delete_log_table. Can someone provide me an example?
postImportDeleteQuery and preImportDeleteQuery are Lucene/Solr queries, not SQL. For example, I am using the following:
preImportDeleteQuery=document_type:(photo OR news OR video)
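For context, a minimal sketch of where such a delete query goes in a DIH data-config.xml (the entity name, table, and columns here are hypothetical; preImportDeleteQuery is the real attribute name on the root entity):

```xml
<!-- Hypothetical data-config.xml fragment: before the import runs,
     DIH deletes every document matching this Solr query. -->
<document>
  <entity name="item" query="SELECT id, title FROM items"
          preImportDeleteQuery="document_type:(photo OR news OR video)">
    <field column="id" name="id"/>
    <field column="title" name="title"/>
  </entity>
</document>
```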
Hi,
I was trying to connect to SolrCloud using CloudSolrServer, and I get the following exception.
I tried clearing the ZooKeeper state and then restarting the Solr instances, but I still get the same exception.
Am I missing something?
org.apache.solr.common.cloud.ZkStateReader: Updating cluster state
I am using Solr 4.0.
./zahoor
On 13-Feb-2013, at 3:56 PM, J Mohamed Zahoor zah...@indix.com wrote:
Hi
I think the
router:compositeId
value inside the cluster state is creating this problem.
./Zahoor
On 13-Feb-2013, at 4:06 PM, J Mohamed Zahoor zah...@indix.com wrote:
Apologies...
I was using 4.1 on the Solr server and 4.0 in the SolrJ client, which caused this problem.
./zahoor
On 13-Feb-2013, at 4:08 PM, J Mohamed Zahoor zah...@indix.com wrote:
Hey,
I want to send the query input through a JSON file; I don't want to pass it as query parameters. Is there any way to do this?
When I send query parameters, the response contains a key called params; if I could send those parameters as JSON, it would be easier for me.
Let's say the input
I'm not sure if I understood you...
You want to send a request like http://localhost/solr/select?q=*:*&wt=json&start=0&fq=course_id:18 and get back only parts of the response for further processing?
Then the easiest way is to retrieve the whole JSON and post-process only responseHeader.params.
BR
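A minimal sketch of that post-processing in Python (the sample response string below is fabricated for illustration; a real one would be the body returned by the request above):

```python
import json

# A fabricated Solr JSON response, standing in for the body returned by
# /solr/select?q=*:*&wt=json&start=0&fq=course_id:18
raw = ('{"responseHeader":{"status":0,"QTime":1,'
       '"params":{"q":"*:*","wt":"json","start":"0","fq":"course_id:18"}},'
       '"response":{"numFound":0,"start":0,"docs":[]}}')

data = json.loads(raw)

# Keep only the echoed request parameters for further processing.
params = data["responseHeader"]["params"]
print(params["fq"])
```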
Hi All,
I am trying to understand how I can get phrase offsets as a result of a search using the SolrJ client.
I have only one field:
<field name="contents" type="text_general" indexed="true" stored="false" termVectors="true" termPositions="true" termOffsets="true" />
I don't want to store the field content in the index.
Ok, I see - you want to send a JSON object which contains the query parameters.
As far as I know, that's not possible out-of-the-box, so you'll have to create a custom SearchHandler
http://lucene.apache.org/solr/4_1_0/solr-core/org/apache/solr/handler/component/SearchHandler.html
for that.
In
While programming some tests I found that two SolrInputDocuments with the same fields and values compare as different.
Trying to figure out why this happens, I found that the SolrInputDocument class uses a LinkedHashMap.
Is the insert order of the fields important for Solr?
Thank you
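I can't speak to SolrInputDocument's equals() specifically, but order-sensitive map comparison is a plausible culprit; Python's OrderedDict shows the same effect (this is only an analogy, not the SolrJ code path):

```python
from collections import OrderedDict

# Same fields and values, inserted in a different order.
a = OrderedDict([("id", "1"), ("name", "x")])
b = OrderedDict([("name", "x"), ("id", "1")])

order_sensitive = (a == b)                # OrderedDict equality considers order
order_insensitive = (dict(a) == dict(b))  # plain dict equality ignores order
```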
Hi:
I am working on a project where we want to recommend products to our users based on their previous 'likes', purchases and so on (typical recommender-system stuff), while letting them browse the catalogue freely via search queries, making use of facets, more-like-this and so on
Maybe it is more about having fast iterations even on a large collection
of fields ?
André
On 02/13/2013 12:43 PM, knort wrote:
Formatted the mail again.
Hi,
I have two dynamic fields, Product-Name-* and Product-Rating-*. One document can contain 5 products and their respective ratings, like below:
<str name="Product-Name-0">HTC Wildfire S</str>
<str name="Product-Name-1">Samsung Tab 2</str>
<str name="Product-Name-2">Samsung Note</str>
We are beginning talks with our IT department and management about switching from the Google Search Appliance to Solr. One thing we need to figure out is what kind of hardware we are going to require to host the Solr systems.
What type of hardware (at a high level) should I be looking for?
Hi,
I use Solr 3.6.0 with a synonym filter as the last filter at index time, using a list of stemmed terms. When I do a wildcard search that matches part of an entry on the synonym list, the synonyms found are used by Solr to generate the search results. I am trying to disable that
Hi,
I have a question; hope you can help me.
I would like to get a report, using the Solr admin tools, of all the searches made on the system between two dates.
What is the correct way to do it?
BR,
Yoel
Yoel Rosenberg
ALCATEL-LUCENT
Support Engineer
By doing synonyms at index time, you cause apfelsin to be added to
documents that contain only orang, so of course documents that previously
only contained orang will now match for apfelsin or any term query that
matches apfelsin, such as a wildcard. At query time, Lucene cannot tell
whether
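A toy sketch of what index-time expansion does (hypothetical helper, not Lucene's actual synonym filter), showing why a wildcard on apfel* now hits a document that originally contained only orang:

```python
import fnmatch

def expand(doc_terms, synonyms):
    """Index-time expansion: each term also contributes its synonyms."""
    out = set(doc_terms)
    for term in doc_terms:
        out.update(synonyms.get(term, []))
    return out

# The document originally contains only the stemmed term 'orang'.
indexed_terms = expand({"orang"}, {"orang": ["apfelsin"]})

# A wildcard query apfel* is matched against the *indexed* terms,
# so after expansion it matches this document too.
hit = any(fnmatch.fnmatch(t, "apfel*") for t in indexed_terms)
```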
Cool that it worked :)
I had this same problem in my project a few months ago
On Tue, Feb 12, 2013 at 12:57 PM, Sandeep Mestry sanmes...@gmail.com wrote:
Hi Felipe, Just a short note to say thanks for your valuable suggestion. I
had implemented that and could see expected results. The length
Matthew Shapiro [m...@mshapiro.net] wrote:
[Hardware for Solr]
What type of hardware (at a high level) should I be looking for? Are the
main constraints disk I/O, memory size, processing power, etc...?
That depends on what you are trying to achieve. Broadly speaking, simple
search and
Thanks, very interesting.
The admin interface is very useful, although a sample admin-extras.html file somewhere would help - where it should go and what can go in it would be good to know. Right now, all we get is an exception in the logs about the file not existing.
You only
I am looking at the source code of 4.1.0 and I cannot find any proof that Solr 4.1.0's DIH would actually use any properties from the solrcore.properties file.
I did, however, find that Solr does load my solrcore.properties file...
It's strange that this would have been changed.
Does anybody have
Thanks for the reply.
If most of the searches are exactly the same (e.g. the empty search), the result will be cached. If 5,683 searches/month is the real count, this sounds like a very low number of searches on a very limited corpus. Just about any machine should be fine. I guess I am
Matthew,
With an index that small, you should be able to build a proof of
concept on your own hardware and discover how it performs using
something like SolrMeter:
Michael Della Bitta
Appinions
18 East 41st Street, 2nd Floor
New York, NY
Ooops: https://code.google.com/p/solrmeter/
Michael Della Bitta
Appinions
18 East 41st Street, 2nd Floor
New York, NY 10017-6271
www.appinions.com
Where Influence Isn’t a Game
On Wed, Feb 13, 2013 at 12:25 PM, Michael Della Bitta
That definitely will be a useful tool in this conversion, thanks.
On Wed, Feb 13, 2013 at 12:25 PM, Michael Della Bitta
michael.della.bi...@appinions.com wrote:
Ooops: https://code.google.com/p/solrmeter/
On 2/13/2013 9:52 AM, Andre Bois-Crettez wrote:
The code that resolves variables in DIH was refactored extensively in 4.1.0.
So if you've got a case where it does not resolve the variables properly,
please give the details. We can open a JIRA issue and get this fixed.
James Dyer
Ingram Content Group
(615) 213-4311
-Original
Hi,
I have opened a couple of JIRA issues: one to make HttpShardHandlerFactory and LBHttpSolrServer more easily extendable:
https://issues.apache.org/jira/browse/SOLR-4448 and one with an implementation
of a backup-requesting load balancer:
https://issues.apache.org/jira/browse/SOLR-4449 .
The
Have you looked at the pf parameter for the dismax handlers? pf does, I think, what you are looking for, which is to boost documents where the query terms match the various fields exactly as a phrase, with some phrase slop.
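For illustration, a sketch of the request parameters involved (the field names and boosts here are made up; defType, qf, pf, and ps are the real Solr parameter names):

```python
from urllib.parse import urlencode

# Hypothetical edismax request: qf scores individual terms, while pf
# additionally boosts documents where the whole query matches as a
# phrase (within ps positions of slop) in the listed fields.
params = {
    "defType": "edismax",
    "q": "project manager",
    "qf": "title^2 description",
    "pf": "title^5 description^2",
    "ps": "2",
}
query_string = urlencode(params)
```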
On Wed, Feb 13, 2013 at 2:59 AM, Hemant Verma hemantverm...@gmail.com wrote:
Hi All
I
Ultimately this is dependent on what your metrics for success are. For some
places it may be just raw CTR (did my click through rate increase) but for
other places it may be a function of money (either it may be gross revenue,
profits, # items sold etc). I don't know if there is a generic answer
So just a hunch... but when the slave downloads the data from the master,
doesn't it do a commit to force solr to recognize the changes? In so doing,
wouldn't that increase the generation number? In theory it shouldn't matter
because the replication looks for files that are different to determine
A search for id is much too broad. I looked at 3 of the SolrCloud classes you mention and none of those ids have anything to do with the unique field in the schema. I have not looked at the hash-based router, but if you find a real issue then please file a JIRA issue.
- Mark
On Feb 12,
On Feb 13, 2013, at 1:17 PM, Amit Nithian anith...@gmail.com wrote:
doesn't it do a commit to force solr to recognize the changes?
yes.
- Mark
Ah, you mention most of the SolrCloud ones don't look like a problem.
The other two then:
1. RealTimeGetComponent - doesn't look like a schema field usage, don't see a
problem.
2. HashBasedRouter - looks like it could be a problem and this is new for 4.1 -
this is something we should document
Is there a strong reason why we still need solr.xml on disk, and why it cannot be persisted in and used from ZooKeeper?
thanks,
--
Anirudha P. Jadhav
: Is the insert order of the fields important for Solr?
svn blame can frequently be useful for understanding why specific
choices were made...
http://svn.apache.org/viewvc?view=revision&revision=604951
https://issues.apache.org/jira/browse/SOLR-439
...in a nutshell: it may not matter to you what
Yes, though the reasons are not so interesting.
Soon solr.xml is going away regardless - perhaps in another release or two.
- mark
On Feb 13, 2013, at 2:02 PM, Anirudha Jadhav aniru...@nyu.edu wrote:
Matthew Shapiro [m...@mshapiro.net] wrote:
Sorry, I should clarify our current statistics. First of all, I meant 183k
documents (not 183, woops). Around 100k of those are full-fledged HTML
articles (not web pages, but articles in our CMS with HTML content inside
of them),
If an article is
there's an open feature request about this, but part of the problem is
that it's extremely hard to implement something like this efficiently in a
distributed query...
https://issues.apache.org/jira/browse/SOLR-1712
: Date: Wed, 6 Feb 2013 20:03:16 -0800
: From: Neelesh
: suggester simply looks at the terms in the index and returns some of them,
: it's not aware (that I know of) of which docs the terms came from, so I
I'm not certain, but isn't that where the spellcheck.collate option can be used?
Okay, so then that should explain the generation difference of 1 between the master and slave.
On Wed, Feb 13, 2013 at 10:26 AM, Mark Miller markrmil...@gmail.com wrote:
Try the spellchecker rather than the suggester/auto-complete:
http://wiki.apache.org/solr/SpellCheckComponent
-- Jack Krupansky
-----Original Message-----
From: ALEX PKB
Sent: Wednesday, February 13, 2013 2:34 PM
To: solr-user@lucene.apache.org
Subject: auto-complete with typo fuzzy
Thank you Ahmet.
I figured it out. I had to define a separate entity which takes care of deletes:
<entity name="DeleteEntity" query="
  SELECT ID AS [$deleteDocById] FROM Log
  WHERE '${dataimporter.request.clean}' = 'false'
  AND Log_Date >= '${dataimporter.last_index_time}'"/>
The key to getting this working is to set spellcheck.maxCollationTries > 0. It will then generate collations even if there is only 1 term.
James Dyer
Ingram Content Group
(615) 213-4311
-----Original Message-----
From: Chris Hostetter [mailto:hossman_luc...@fucit.org]
Sent: Wednesday, February 13,
All,
Thank you for your comments and links; I will explore them.
I think that many people face similar questions when they tune their search engines, especially in the Solr/Lucene community. While the requirements will be different, ultimately it is what they can do with Lucene/Solr that guides
Excellent, thank you very much for the reply!
On Wed, Feb 13, 2013 at 2:08 PM, Toke Eskildsen t...@statsbiblioteket.dk wrote:
: Thanks for confirming my suspicions, the custom
: TokenLengthMarkerFilterFactory sounds like the best approach for doing this.
that sounds like something that could be generally useful to lots of
people ... by all means please open a jira issue and attach whatever you
come up with for
James,
I debugged it until I found where things go 'wrong'.
Apparently the current VariableResolver implementation does not allow the use of a period '.' in any variable/property key you want to use... it's reserved for namespaces.
Personally I would really love to use a period in my
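A toy illustration of why a dotted key breaks (this is not DIH's actual VariableResolver, just a sketch of namespace-style resolution): each '.' is treated as a step into a nested namespace, so a flat key that merely contains a period can never be found.

```python
def resolve(variables, key):
    """Resolve 'a.b.c' by walking nested namespaces, in the style of
    DIH's ${dataimporter.last_index_time} lookups."""
    node = variables
    for part in key.split("."):
        node = node[part]
    return node

namespaced = {"dataimporter": {"last_index_time": "2013-02-13 00:00:00"}}
ok = resolve(namespaced, "dataimporter.last_index_time")

# A flat key containing a period is misinterpreted as a namespace path.
flat = {"my.prop": "x"}
try:
    resolve(flat, "my.prop")
    broken = False
except KeyError:
    broken = True
```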
I think the order needs to be in lowercase. Try asc instead of ASC.
-Michael
-----Original Message-----
From: PeterKerk [mailto:vettepa...@hotmail.com]
Sent: Wednesday, February 13, 2013 7:30 PM
To: solr-user@lucene.apache.org
Subject: Can't determine Sort Order: 'prijs ASC', pos=5
On this
Or is there a way to achieve this using the EDismax query parser?
From: pragyans...@outlook.com
To: solr-user@lucene.apache.org
Subject: RE: Search over dynamic fields
Date: Wed, 13 Feb 2013 19:09:24 +0530
Ah yes, sorry, I misunderstood. Another option is to use word n-grams (shingles) so that projectmanager becomes a term; any query involving project manager in india with 2 years experience would then match higher, because the query would contain projectmanager as a term.
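A minimal sketch of the idea (a hypothetical helper, not Solr's actual ShingleFilter): concatenate adjacent tokens into word n-grams so that the phrase contributes a single term like projectmanager.

```python
def word_ngrams(tokens, n=2):
    """Concatenate each run of n adjacent tokens into one term."""
    return ["".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

terms = word_ngrams(["project", "manager", "in", "india"])
```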
On Wed, Feb 13, 2013 at 9:56 PM, Hemant Verma
Hi Pragyanshis,
What happens when you remove the bq parameter?
--- On Thu, 2/14/13, Pragyanshis Pattanaik pragyans...@outlook.com wrote:
From: Pragyanshis Pattanaik pragyans...@outlook.com
Subject: Why a phrase is getting searched against default fields in solr
To: solr Forum
I agree that's definitely strange, I'll have a look at it.
Tommaso
2013/2/12 Chris Hostetter hossman_luc...@fucit.org
: So it seems that facet.query is using the analyzer of type index.
: Is it a bug or is there another analyzer type for the facet query?
That doesn't really make any
OK, then index generation and index version are ruled out when it comes to verifying that the master and slave index are in sync.
What else is possible?
The strange thing is that if the master is 2 or more generations ahead of the slave, then it works!
With your logic the slave must _always_ be one generation