Mike Anderson
I'd second the request for more information on the current state of
SolrCloud. I have a 16 shard Solr setup in production running 1.3, and a lot
of the features of SolrCloud would make my life a lot easier.
Cheers,
Mike
On Sat, Jul 24, 2010 at 12:52 PM, Dennis Gearon gear...@sbcglobal.net wrote:
You might check out Luke, the Lucene Index Toolbox.
http://www.getopt.org/luke/
I know you can browse the index and get frequency counts, though I'm not
sure if you can export the entire index as a list like what you're looking
for.
Hope this helps,
Mike
On Mon, Sep 6, 2010 at 10:52 AM, Roland
at 10:33 AM, Markus Jelsma markus.jel...@buyways.nl wrote:
There is a recent thread on this one
http://www.mail-archive.com/solr-user@lucene.apache.org/msg40491.html
On Friday 24 September 2010 16:30:36 mike anderson wrote:
What is the right way to upgrade a solr index from Lucene 2.9.1 to 3.x
.
If anybody has some insight into this kind of project I'd love to get some
feedback.
Thanks in advance,
Mike Anderson
at java.net's Faban benchmarking
framework. We use it extensively for our acceptance tests and tuning
exercises.
Joshua
On Oct 27, 2009, at 1:59 PM, Mike Anderson wrote:
I've been making modifications here and there to the Solr source
code in
hopes to optimize for my particular setup. My
I took a look through my Solr logs this weekend and noticed that the longest
queries were on particular fields, like author:albert einstein. Is this a
result consistent with other setups out there? If not, is there a trick to
make these go faster? I've read up on filter queries and use those when
You can see what revision the patch was written for at the top of the patch,
it will look like this:
Index: org/apache/solr/handler/MoreLikeThisHandler.java
===
--- org/apache/solr/handler/MoreLikeThisHandler.java (revision 772437)
erickerick...@gmail.com
wrote:
H, are you sorting? And have your readers been reopened? Is the
second query of that sort also slow? If the answer to this last question
is no, have you tried some autowarming queries?
Best
Erick
On Mon, Nov 2, 2009 at 4:34 PM, mike anderson
This is somewhat of an odd use-case for MLT. Basically I'm using it for
near-duplicate detection (I'm not using the built in dup detection for a
variety of reasons). While this might sound like an okay idea, the problem
lies in the order in which things happen. Ideally, duplicate detection would
I'm trying to understand how content stream works with respect to MLT. I did
a regular MLT query using a document ID and specifying two fields to do MLT
on and got back a set of results. I then copied the xml for the document
with the aforementioned ID and pasted it to a text file. Then I made the
How exactly is MLT calculated? I'm trying to gain an intuition for it by
tweaking the parameters MLT.qf, MLT.mintf, and MLT.mindf (mostly the former,
changing boosts), but so far it's a bit counterintuitive. How does
MLT.boost play in?
If anybody could point me to a technical description
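For anyone experimenting with the same parameters, here is a minimal Python sketch that only builds a MoreLikeThis request URL, so that runs with different mlt.qf boosts can be compared side by side. It sends no request. The document id, the field names title and abstract, and the use of an id field for the lookup are assumptions made for illustration; they do not come from the thread.

```python
from urllib.parse import urlencode


def mlt_url(doc_id, qf, mintf=1, mindf=1, boost=True,
            base="http://localhost:8983/solr/select"):
    """Build a MoreLikeThis request URL; no request is sent."""
    # Derive the plain field list for mlt.fl by dropping the "^boost" suffixes.
    fields = ",".join(part.split("^")[0] for part in qf.split())
    params = {
        "q": f"id:{doc_id}",          # assumed unique-key field named "id"
        "mlt": "true",
        "mlt.fl": fields,
        "mlt.qf": qf,                 # e.g. "title^2.0 abstract^0.5"
        "mlt.mintf": mintf,
        "mlt.mindf": mindf,
        "mlt.boost": "true" if boost else "false",
    }
    return base + "?" + urlencode(params)


if __name__ == "__main__":
    # Two variants that differ only in the boosts, for manual comparison.
    print(mlt_url("doc42", "title^2.0 abstract^0.5"))
    print(mlt_url("doc42", "title^0.5 abstract^2.0"))
```

Changing one parameter at a time and diffing the returned documents is usually an easier way to build intuition than changing several boosts at once.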
I am getting this exception as well, but disk space is not my problem. What
else can I do to debug this? The solr log doesn't appear to lend any other
clues..
Jan 25, 2010 4:02:22 PM org.apache.solr.core.SolrCore execute
INFO: [] webapp=/solr path=/update params={} status=500 QTime=1990
Jan 25,
I think you might be looking for Apache Tika.
On Mon, Jan 25, 2010 at 3:55 PM, Frank van Lingen fr...@vanlingen.name wrote:
I recently started working with solr and find it easy to set up and tinker
with.
I now want to scale up my setup and was wondering if there is an
application/component
There might be an OCR plugin for Apache Tika (which does exactly this out of
the box except for OCR capability, I believe).
http://lucene.apache.org/tika/
-mike
2010/2/4 Kranti™ K K Parisa kranti.par...@gmail.com
Hi,
Can anyone list the best OCR APIs available to use in combination with
Has anybody got solr-ruby to return a clustering result? (using the
clustering component)
I'm almost certain the query is correct (I check the solr logs for the
query and run it in my browser, get back the cluster output as
expected). But when I dump the response from my solr-ruby query the
It seemed like SOLR-1316 was a little too long to continue the conversation.
Is there support for quotes indicating a phrase query? For example, my
autosuggest query for "mike sha" ought to return "mike shaffer", "mike
sharp", etc. Instead I get suggestions for "mike" and for "sha", resulting
in a collated
I'm exploring the possibility of using cores as a solution to bookmark
folders in my solr application. This would mean I'll need tens of thousands
of cores... does this seem reasonable? I have plenty of CPUs available for
scaling, but I wonder about the memory overhead of adding cores (aside from
On Fri, Oct 22, 2010 at 1:12 AM, Jonathan Rochkind rochk...@jhu.edu
wrote:
No, it does not seem reasonable. Why do you think you need a separate
core for every user?
mike anderson wrote:
I'm exploring the possibility of using cores as a solution to bookmark
folders in my solr
://wiki.apache.org/solr/CoreAdmin
Since Solr 1.3
On Fri, Oct 22, 2010 at 1:40 PM, mike anderson saidthero...@gmail.com
wrote:
Thanks for the advice, everyone. I'll take a look at the API mentioned and
do some benchmarking over the weekend.
-Mike
On Fri, Oct 22, 2010 at 8:50 AM
, Oct 26, 2010 at 10:15 AM, Jonathan Rochkind rochk...@jhu.edu wrote:
mike anderson wrote:
I'm really curious if there is a clever solution to the obvious problem
with: "So your better off using a single index and with a user id and use
a query filter with the user id when fetching data.", i.e
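The single-index approach quoted above can be sketched in Python as follows. This is only an illustration: the field name user_id is hypothetical, and the function builds the URL without sending it. The fq parameter restricts the result set to one user's documents without affecting how the main query is scored.

```python
from urllib.parse import urlencode


def user_query_url(user_query, user_id,
                   base="http://localhost:8983/solr/select"):
    """Build a query URL restricted to one user's documents via fq."""
    params = [
        ("q", user_query),
        ("fq", f"user_id:{user_id}"),  # hypothetical field holding the owner id
    ]
    return base + "?" + urlencode(params)


if __name__ == "__main__":
    print(user_query_url("graph theory", 17))
```

Because the filter is a separate parameter, Solr can cache it independently of the main query, which is the usual argument for this layout over one core per user.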
:
On Wed, 2010-10-27 at 14:20 +0200, mike anderson wrote:
[...] By my simple math, this would mean that if we want each shard's
index to be able to fit in memory, [...]
Might I ask why you're planning on using memory-based sharding? The
performance gap between memory and SSDs is not very big so
Making sure the index can fit in memory (you don't have to allocate that
much to Solr, just make sure it's available to the OS so it can cache it --
otherwise you are paging the hard drive, which is why you are probably IO
bound) has been the key to our performance. We recently opted to use less
Not sure if this was mentioned yet, but if you are doing slave/master
replication you'll need 2x the RAM at replication time. Just something to
keep in mind.
-mike
On Mon, Jan 10, 2011 at 5:01 PM, Toke Eskildsen t...@statsbiblioteket.dk wrote:
On Mon, 2011-01-10 at 21:43 +0100, Paul wrote:
I
[x] ASF Mirrors (linked in our release announcements or via the Lucene
website)
[] Maven repository (whether you use Maven, Ant+Ivy, Buildr, etc.)
[x] I/we build them from source via an SVN/Git checkout.
[] Other (someone in your company mirrors them internally or via a
downstream project)
On
Could you make an additional date field, call it date_boost, that gets
populated in all of the cores EXCEPT the one with the newest articles, and
then boost on this field? Then when you move articles from the 'newest' core
to the rest of the cores you copy over the date to the date_boost field. (I
Check out Logg.ly. http://www.loggly.com/. They use SOLR to index all kinds
of logs, SOLR included. This is a paid service, so maybe not what you're
looking for. I've used it though, works great.
-Mike
On Sun, Jun 26, 2011 at 5:49 AM, Mr Havercamp mrhaverc...@gmail.com wrote:
I'm interested to
I am e-mailing to inquire about the status of the spellchecking component in
1.4 (distributed). I saw SOLR-785, but it is unreleased and for 1.5. Any
help would be much appreciated.
Thanks in advance,
Mike
Hi all,
I am e-mailing to inquire about the status of the spellchecking component in
1.4 (distributed). I saw SOLR-785, but it is unreleased and appears to be
for 1.5. Any help would be much appreciated.
Thanks in advance,
Mike
(sorry if this sent twice)
whoops, sorry guys
On Sat, Aug 8, 2009 at 12:37 PM, mike anderson saidthero...@gmail.com wrote:
Hi all,
I am e-mailing to inquire about the status of the spellchecking component
in 1.4 (distributed). I saw SOLR-785, but it is unreleased and appears to be
for 1.5. Any help would be much
I set up the spell check component with this code in the config file:
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <str name="queryAnalyzerFieldType">textSpell</str>
  <lst name="spellchecker">
    <str name="name">titleCheck</str>
    <str name="classname">solr.IndexBasedSpellChecker</str>
    <str
I'm trying to get MLT working in 1.4 distributed mode. I was hoping the
patch SOLR-788 would do the trick, but after
applying the patch by hand to revision 737810 (it kept choking on
component/MoreLikeThisComponent.java) I still get nothing. The URL I am
using is this:
/solrq=theory+of+colorful+graphs&mlt.mintf=1&mlt=true}
status=0 QTime=164
On Tue, Aug 18, 2009 at 11:30 AM, Grant Ingersoll gsing...@apache.org wrote:
Are there errors in the logs?
-Grant
On Aug 18, 2009, at 10:42 AM, mike anderson wrote:
I'm trying to get MLT working in 1.4 distributed mode
PM, mike anderson saidthero...@gmail.com wrote:
There don't appear to be any related errors in the log. I've included it
below anyhow (there is a java.lang.NumberFormatException, I'm not sure what
that is).
thanks,
mike
for the query:
http://localhost:8983/solr/select?q=%22theory%20of
I'm kind of stumped by this one.. is it something obvious?
I'm running the latest trunk. In some cases the stopFilterFactory isn't
removing the field name.
Thanks in advance,
-mike
From debugQuery (both words are in the stopwords file):
the
problem is.
-mike
On Mon, Sep 14, 2009 at 1:10 AM, Yonik Seeley yo...@lucidimagination.com wrote:
That's pretty strange... perhaps something to do with your synonyms
file mapping "for" to a zero-length token?
-Yonik
http://www.lucidimagination.com
On Mon, Sep 14, 2009 at 12:13 AM, mike anderson
Could this be related to SOLR-1423?
On Mon, Sep 14, 2009 at 8:51 AM, Yonik Seeley yo...@lucidimagination.com wrote:
Thanks, I'll see if I can reproduce...
-Yonik
http://www.lucidimagination.com
On Mon, Sep 14, 2009 at 2:10 AM, mike anderson saidthero...@gmail.com
wrote:
Yeah
38 matches