Is there some special casing in the highlighter to skip query syntax words? The
words “and” and “or” don’t get highlighted.
This is in 6.5.0.
question
html
440
fastVector
1
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my
Yes, but it is better than nothing. Don’t let the unavailable perfect solution
keep you from implementing the available good solution.
If you want to easily use fuzzy search with edismax, check out the patch
submitted with SOLR-629.
wunder
Walter Underwood
wun...@wunderwood.org
http
I set up two suggesters, one fuzzy and one analyzing infix. That gives two sets
of suggestions, so the client code has to merge them into one list and toss
duplicates.
They use the same weights, so I can keep the top weighted suggestions.
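
A minimal sketch of that client-side merge, assuming each suggestion has already been parsed into a dict with "term" and "weight" keys (the function name and the dict shape are illustrative, not part of Solr):

```python
def merge_suggestions(fuzzy, infix, limit=10):
    """Merge two suggestion lists, drop duplicate terms, keep the top weights."""
    best = {}
    for suggestion in list(fuzzy) + list(infix):
        term = suggestion["term"]
        # Keep only the highest-weighted copy of each term.
        if term not in best or suggestion["weight"] > best[term]["weight"]:
            best[term] = suggestion
    # Highest weight first; ties are broken alphabetically for stable output.
    ranked = sorted(best.values(), key=lambda s: (-s["weight"], s["term"]))
    return ranked[:limit]


fuzzy = [{"term": "microsoft office", "weight": 14},
         {"term": "microsoft excel", "weight": 13}]
infix = [{"term": "microsoft excel", "weight": 13},
         {"term": "office 365", "weight": 9}]
merged = merge_suggestions(fuzzy, infix)
```

Because both suggesters use the same weights, comparing weights across the two lists is meaningful; with different weighting schemes this merge would need a normalization step.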
wunder
Walter Underwood
wun...@wunderwood.org
http
OK. We’re going with a separate call to /suggest. For those of us with
controlled vocabularies, a suggest.distrib would be a handy thing.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Jun 22, 2017, at 4:32 PM, Alessandro Benedetti <a.
suggesters anyway, both fuzzy and infix.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Jun 6, 2017, at 4:34 AM, Rick Leir <rl...@leirtech.com> wrote:
>
>> typeahead solutions using a separate collection
>
> Erik, Do you u
I really don’t understand [1]. I read the JavaDoc for that, but how does it
help? What do I put in the solrconfig.xml?
I’m pretty good at figuring out Solr stuff. I started with Solr 1.2.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Jun
set the distrib default in the suggester
component instead of in the request handler?
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
page.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
tps://en.wikipedia.org/robots.txt>
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Jun 1, 2017, at 4:58 PM, Mike Drob <md...@apache.org> wrote:
>
> Isn't this exactly what Apache Nutch was built for?
>
> On Thu, Jun 1, 2017 at 6:5
Which was exactly what I suggested.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Jun 1, 2017, at 3:31 PM, David Choi <choi.davi...@gmail.com> wrote:
>
> In the mean time I have found a better solution at the moment is to t
licates, etc. The output of the
crawl goes to Solr. That is how we did it with Ultraseek (before Solr existed).
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Jun 1, 2017, at 3:01 PM, David Choi <choi.davi...@gmail.com> wrote:
>
Pretty sure that master/slave was in Solr 1.2. That was very nearly ten years
ago.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On May 26, 2017, at 9:52 AM, David Hastings <hastings.recurs...@gmail.com>
> wrote:
>
> Im curious
the freshness to Graphite. It is generally under 300 ms.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On May 24, 2017, at 12:51 PM, Webster Homer <webster.ho...@sial.com> wrote:
>
> Actually I wrote a service that calls the collecti
text queries. Students enter queries with hundreds of words (copy/paste),
but we truncate at 40 terms.
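
A rough sketch of that kind of truncation, assuming terms are simply whitespace-separated words (the real cutoff may be applied after analysis; the function name is made up for illustration):

```python
MAX_TERMS = 40  # the limit mentioned above


def truncate_query(query, max_terms=MAX_TERMS):
    """Keep only the first max_terms whitespace-separated terms of a query."""
    return " ".join(query.split()[:max_terms])


pasted = " ".join(["word"] * 300)   # stand-in for a copy/pasted paragraph
short = truncate_query(pasted)
```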
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On May 24, 2017, at 12:33 PM, Nawab Zada Asad Iqbal <khi...@gmail.com> wrote:
>
>
reporting. Use 6.5.1.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On May 23, 2017, at 5:27 PM, Nawab Zada Asad Iqbal <khi...@gmail.com> wrote:
>
> Hi all,
>
> I am planning to upgrade my solr.4.x installation to a recent stable
That was on Solr 1.3, so I’m pretty sure it was the whitespace tokenizer.
The synonym substitution for “+/-” was done in client code and indexing code,
outside of Solr. We also sanitized queries to remove all query syntax
characters.
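
One way such a sanitizer might look, as a sketch only. The character list is the set of special characters from the standard Lucene query syntax; the original client code is not shown here, so this is an assumption about the approach, not a copy of it:

```python
import re

# Characters with special meaning in Lucene/Solr query syntax.
_SYNTAX_CHARS = re.compile(r'[+\-&|!(){}\[\]^"~*?:\\/]')


def sanitize(query):
    """Replace query syntax characters with spaces and collapse whitespace."""
    return " ".join(_SYNTAX_CHARS.sub(" ", query).split())


cleaned = sanitize('title:(new york)^2')
```

Note that this leaves the words AND, OR, and NOT alone; whether to lowercase or drop those is a separate decision.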
wunder
Walter Underwood
wun...@wunderwood.org
uation. And everyone searched for “[•REC]²” as “rec2”. The
middot is supposed to be red. Movie studios are clueless about searchable
strings.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On May 23, 2017, at 10:41 AM, Erick Erickson <erickerick...
Look at all the bugs fixed or reported after 6.0.0. This might have been
reported and there might be a workaround.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On May 16, 2017, at 11:41 AM, Susheel Kumar <susheel2...@gmail.com> wrote
I would upgrade to 6.5.1 before doing anything else. 6.0.0 is more than a year
old.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On May 16, 2017, at 10:27 AM, Susheel Kumar <susheel2...@gmail.com> wrote:
>
> Also this is
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On May 10, 2017, at 8:49 AM, Karl-Philipp Richter <krich...@posteo.de> wrote:
>
> Hi,
>
> Am 10.05.2017 um 17:03 schrieb Walter Underwood:
>> I have contributed some answers
” answer, even if it is wrong. Very frustrating. This happens a lot with
questions about antennas.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On May 10, 2017, at 7:47 AM, Erick Erickson <erickerick...@gmail.com> wrote:
>
> Personal
CDCR doesn’t rebuild it so much as copy it.
To change the schema, you’ll need to reindex.
I’ve worked on two NoSQL databases (Objectivity and MarkLogic) and I’ve worked
on Solr. They are utterly different designs, intended to do different things.
wunder
Walter Underwood
wun...@wunderwood.org
Which garbage collector are you using? The default GC will probably give long
pauses.
You need to use CMS or G1.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On May 8, 2017, at 8:48 AM, Erick Erickson <erickerick...@gmail.com> wrote
feature.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On May 3, 2017, at 3:53 PM, Rick Leir <rl...@leirtech.com> wrote:
>
> +Walter test it
>
> Jeff,
> How much CPU does the EC2 hypervisor use? I have heard 5% but that is
Hmm, has anyone measured the overhead of timeAllowed? We use it all the time.
If nobody has, I’ll run a benchmark with and without it.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On May 2, 2017, at 9:52 AM, Chris Hostetter <hossm
Might want to measure the single CPU performance of your EC2 instance. The last
time I checked, my MacBook was twice as fast as the EC2 instance I was using.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On May 1, 2017, at 6:24 PM, Chris Hostet
ho `date` ": 90th percentiles are $pct90"
echo `date` ": 95th percentiles are $pct95"
echo `date` ": full results are in ${test}"
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Apr 28, 2017, at 12:00 PM,
thing. Thank you SolrJ.
Our SLAs are for 95th percentile.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Apr 28, 2017, at 11:39 AM, Erick Erickson <erickerick...@gmail.com> wrote:
>
> Well, the best way to get no cache hits is t
More “unrealistic” than “amazing”. I bet the set of test queries is smaller
than the query result cache size.
Results from cache are about 2 ms, but network communication to the shards
would add enough overhead to reach 40 ms.
wunder
Walter Underwood
wun...@wunderwood.org
http
What is the message in the log when it crashes?
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Apr 27, 2017, at 10:10 AM, Vijay Kokatnur <kokatnur.vi...@gmail.com> wrote:
>
> We recently upgraded 4.5 index to 6.5 using IndexUpgra
”, and so on for other SRPs.
There were a few other filters, like G-rated movies or streaming, DVD, HD DVD,
or Bluray.
The full index was under 350K documents.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Apr 27, 2017, at 10:01 AM, Rick Leir
Also, 300,000 documents is fairly small for Solr. We handle a million queries
per day with a few servers on a collection that size.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Apr 26, 2017, at 10:33 PM, Walter Underwood <wun...@wunderwo
Do they have the same fields or different fields? Are they updated separately
or together?
If they have the same fields and are updated together, I’d put them in the same
collection. Otherwise, probably separate.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org
and then to host names
(drives me nuts). Same thing for scaling back, take it out of the load balancer
and shoot it.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Apr 25, 2017, at 9:23 AM, Erick Erickson <erickerick...@gmail.com>
Do a range search on that field with the desired date range. Request rows=0.
Compare the numFound to the total docs.
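
A small sketch of the request parameters, assuming a date field named "timestamp" (a placeholder; use your own field). It only builds the query string and does the arithmetic; sending the request is left out. The total can come from the same request with q=*:* and rows=0.

```python
from urllib.parse import urlencode


def range_count_params(field, start, end):
    """Parameters that count docs in a date range without returning any rows."""
    return {"q": f"{field}:[{start} TO {end}]", "rows": 0, "wt": "json"}


def fraction_in_range(num_found, total_docs):
    """Share of the collection that falls inside the range."""
    return num_found / total_docs if total_docs else 0.0


params = range_count_params("timestamp",
                            "2017-01-01T00:00:00Z", "2017-04-01T00:00:00Z")
query_string = urlencode(params)  # append to .../select?
```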
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Apr 22, 2017, at 8:40 AM, Rick Leir <rl...@leirtech.com> wrote:
Using a minimum score cut off does not work. The score is not an absolute
estimate of relevance.
The idf component of the score is a whole-corpus metric. When you add or delete
documents, the scores for the exact same query can change.
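
To make that concrete, here is a toy calculation using a textbook idf, log(N/df). This is an illustration of the effect, not the exact formula of any particular Solr similarity:

```python
import math


def idf(num_docs, doc_freq):
    """Textbook inverse document frequency: log(N / df)."""
    return math.log(num_docs / doc_freq)


# Same term, same 10 matching documents, but the corpus doubles in size.
before = idf(1000, 10)
after = idf(2000, 10)
```

The matching documents did not change, yet the idf term went up, so any fixed score threshold now means something different.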
wunder
Walter Underwood
wun...@wunderwood.org
http
than the first hit for
query B.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Apr 21, 2017, at 9:35 AM, tstusr <ulfrhe...@gmail.com> wrote:
>
> Since we report the score, we think there will be some relation between them.
> As far
Sorry, that was formatted. The quotes are actually escaped, like this:
{"term":"microsoft office","weight":14,"payload":"{\"count\": 1534255,
\"id\": \"microsoft office\"}"}
wunder
Walter Underwood
wun...@wunderwood.org
"count": 1534255, "id": "microsoft office"}"
},
{
term: "microsoft excel",
weight: 13,
payload: "{"count": 940151, "id": "microsoft excel"}"
},
wunder
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.w
We recently needed multiple values in the payload, so I put a JSON blob in
there. It comes back as a string, so you have to decode that JSON separately.
Otherwise, it was a pretty clean solution.
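
A short sketch of the double decode on the client side, assuming one suggestion has been pulled out of the response as a JSON string of the shape shown elsewhere in this thread:

```python
import json

# One suggestion as it appears on the wire; the payload is an escaped JSON string.
raw = r'{"term":"microsoft office","weight":14,"payload":"{\"count\": 1534255, \"id\": \"microsoft office\"}"}'

suggestion = json.loads(raw)                  # first decode: the suggestion
payload = json.loads(suggestion["payload"])   # second decode: the payload blob
count = payload["count"]
```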
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On
. It is
still too easy for a good match to have a low score. We’re back to increasing
the good hits vs reducing the bad hits. You really only achieve one of those
two.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Apr 12, 2017, at 7:41 PM, Koji Sekigu
, there are scraps of info about beach parking
in multiple other pages. Fix the content.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Apr 12, 2017, at 11:44 AM, David Kramer <david.kra...@shoebuy.com> wrote:
>
> The idea is to no
Does the KeywordTokenizer make each value into a unitary string or does it take
the whole list of values and make that a single string?
I really hope it is the former. I can’t find this in the docs (including
JavaDocs).
wunder
Walter Underwood
wun...@wunderwood.org
http
all the ratios and stuff. When we were running CMS, I set a size for
the heap and a size for the new space. Done. With G1, I don’t even get that
fussy.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Apr 11, 2017, at 8:22 PM, Shawn Heisey &
When I have done this, it is in multiple steps.
1. Change the indexing so that no data is going to that field.
2. Reindex, so the field is empty.
3. Remove the field from the schema.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Apr 11, 2
proportional to the number of
distinct terms in the index (the vocabulary). A rule of thumb is the vocabulary
is proportional to the square root of the number of terms in the index. Which
is often related to the number of documents. With this assumption, four shards
gives a 2X speedup. Which has wor
MarkMail is also good.
http://markmail.org/search/?q=solr-user#query:solr-user%20list%3Aorg.apache.lucene.solr-user+page:1+state:facets
<http://markmail.org/search/?q=solr-user#query:solr-user
list:org.apache.lucene.solr-user+page:1+state:facets>
wunder
Walter Underwood
wun...@wunderwo
http://stackoverflow.com/questions/33588262/tesseract-ocr-on-aws-lambda-via-virtualenv/35724894#35724894
<http://stackoverflow.com/questions/33588262/tesseract-ocr-on-aws-lambda-via-virtualenv/35724894#35724894>
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.or
Converting from PDF to text is embarrassingly parallel. You can throw as many
machines at it as you want. This is a great time to use a cloud computing
service. Need 1000 machines? No problem.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Mar
6.3.0. No idea how it is happening, but I got two replicas on the same host
after one host went down.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Mar 18, 2017, at 8:35 PM, Erick Erickson <erickerick...@gmail.com> wrote:
>
> Hm
rate Amazon EC2 instances, one JVM per instance, no rules, other than the
default.
"maxShardsPerNode":"1",
> bug #1 has been more or less of a pain for quite a while, work is ongoing
> there.
Glad to share our logs.
wunder
> FWIW,
> Erick
>
> On
equal traffic to each core without considering the host. Each
host should get equal traffic, not each core.
Bug #4 is putting two replicas from the same shard on one instance. That is
just asking for trouble.
When it works, this cluster is awesome.
wunder
Walter Underwood
wun...@wunderwood.org
That fails if Solr is not available.
To avoid dropping updates, you need some kind of persistent queue. We use
Amazon SQS for our incremental updates.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Mar 17, 2017, at 10:09 AM, OTH <
; org.apache.solr.util.SolrCLI$ZkCpTool; Could
not complete the zk operation for reason: KeeperErrorCode = ConnectionLoss for
/configs/tutors/solrconfig.xml
ERROR: KeeperErrorCode = ConnectionLoss for /configs/tutors/solrconfig.xml
wunder
Walter Underwood
wun...@wunderwood.org
http
I have a pretty good guess what happened. I requested a Zookeeper 3.4.6
cluster, but they built a 3.4.9 cluster.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Mar 15, 2017, at 4:38 PM, Walter Underwood <wun...@wunderwood.org>
. The largest file uploaded was 1094 bytes.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Mar 15, 2017, at 4:27 PM, Walter Underwood <wun...@wunderwood.org> wrote:
>
> Python kazoo can talk to zookeeper, uploading these same files. Solr
; zkClient has disconnected
ERROR: Error uploading file
/apps/solr6/server/solr/configsets/tutors/conf/schema.xml to zookeeper path
/configs/tutors/schema.xml
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
Also, upgrade to 6.4.2. There are serious performance problems in 6.4.0 and
6.4.1.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Mar 15, 2017, at 12:05 PM, Liu, Daphne <daphne@cevalogistics.com>
> wrote:
>
> For Solr 6.
, I recommend MarkLogic.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Mar 15, 2017, at 11:02 AM, rangeli nepal <rangeli.ne...@gmail.com> wrote:
>
> Thank you Erick for such a prompt reply. I am bit confused.
> Suppose I ha
President
Clinton vs the earlier President Clinton. Oops, that doesn’t work. Well, the
example used to work with “President Bush”, but now they are both pretty far in
the past.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Mar 13, 2017, at 10:21
00 MB is extremely large.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
master, then replicate.
5. When finished, stop sending updates to the old master and turn it off.
It is a hassle, but it is guaranteed to work.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Mar 13, 2017, at 5:48 AM, Shawn Heisey <apa...@elyogr
and it needs more performance work.
We are using New Relic for monitoring. That makes this sort of check very easy.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Mar 8, 2017, at 8:24 AM, Shawn Heisey <apa...@elyograg.org> wrote:
>
> On 3
, then do an async reload.
I’ve been thinking about time stamping the config directories so I can roll
back to a previous config if the reload fails.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Mar 7, 2017, at 12:47 PM, OTH <omer.t@gma
We are going to production this week using 6.3.0. We don’t have time to re-run
all the load benchmarks on 6.4.2.
We’ll qualify 6.4.2 in a couple of weeks, then upgrade prod if it passes.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Mar 6, 2
that are in your
content, synonyms are a better solution.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Mar 4, 2017, at 8:57 PM, Ryan Yacyshyn <ryan.yacys...@gmail.com> wrote:
>
> Hi everyone,
>
> I was thinking of using the Shingle
similar on the same host as PHP.
Connect to it locally, and let it pool connections to Solr. That will use
Unix-local connections that don’t actually run TCP.
Really, don’t try to fix networking inside PHP.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog
Make two denormalized collections. Just don’t join at query time.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Mar 3, 2017, at 1:01 AM, Preeti Bhat <preeti.b...@shoregrp.com> wrote:
>
> We can't, they are being used for different
Make one collection with denormalized data. This looks like a relational,
multi-table schema in Solr. That will be slow and painful.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Mar 2, 2017, at 9:55 PM, Preeti Bhat <preeti.b...@shoreg
, G1 collector). I recommend a smaller heap so the
OS can use that RAM to cache file buffers.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Mar 2, 2017, at 7:04 AM, Caruana, Matthew <mcaru...@icij.org> wrote:
>
> I’m curre
That is exactly what we do. The entire set of loaded documents is saved as
JSONL in S3. Very handy for loading up a prod index in test for diagnosis or
benchmarking.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Mar 1, 2017, at 8:14 AM, R
Since I always need to know which document was bad, I back off to batches of
one document when there is a failure.
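
A sketch of that back-off, assuming a caller-supplied send_batch function that raises an exception when a batch is rejected (send_batch and the batch size are placeholders, not a real client API):

```python
def index_with_backoff(docs, send_batch, batch_size=1000):
    """Send docs in batches; on failure, resend that batch one doc at a time.

    Returns the list of documents that failed individually.
    """
    bad_docs = []
    for start in range(0, len(docs), batch_size):
        batch = docs[start:start + batch_size]
        try:
            send_batch(batch)
        except Exception:
            # The batch failed: retry each document alone to find the bad ones.
            for doc in batch:
                try:
                    send_batch([doc])
                except Exception:
                    bad_docs.append(doc)
    return bad_docs


# Tiny fake sender for demonstration: rejects any batch containing a "bad" doc.
accepted = []

def fake_send(batch):
    if any(doc.get("bad") for doc in batch):
        raise ValueError("bad document in batch")
    accepted.extend(batch)

docs = [{"id": i} for i in range(5)]
docs[3]["bad"] = True
failed = index_with_backoff(docs, fake_send, batch_size=2)
```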
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Mar 1, 2017, at 6:25 AM, Erick Erickson <erickerick...@gmail.com> wrote:
I strongly recommend using OR instead of AND. Misspellings are in about 10% of
queries. Those tend to get zero results for many variations of AND or
mostly-AND.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Feb 28, 2017, at 11:54 AM, Nil
. This is a pretty vanilla solrconfig.xml.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Feb 27, 2017, at 6:44 PM, Erik Hatcher <erik.hatc...@gmail.com> wrote:
>
> `scores` (plural), you’ve got this below:
>
> Remove that, and li
I added that line because I was getting an error about it being undefined.
At this point, I’m just doing random shit hoping it will work. There is not
enough documentation to use this.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On
.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Feb 27, 2017, at 6:35 PM, Erik Hatcher <erik.hatc...@gmail.com> wrote:
>
> You have an empty “scores” parameter in there. You’re not showing your full
> search request, b
at
org.apache.solr.util.PropertiesUtil.substituteProperty(PropertiesUtil.java:65)
at org.apache.solr.util.DOMUtil.substituteProperties(DOMUtil.java:298)
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Feb 27, 2017, at 6:17 PM, Erik Hatcher <erik.hatc...@gmail.com>
of instances, because they will
do majority voting. Three or five is a good number. Three works with one
failure. Five works with one failure while you are upgrading the Zookeeper
ensemble.
We run a three node ensemble in test and a five node ensemble in prod.
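
The arithmetic behind those numbers, as a small sketch (the function names are just for illustration). A majority of the ensemble must be up, so the number of nodes you can lose is the ensemble size minus the quorum:

```python
def quorum(ensemble_size):
    """Smallest number of nodes that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """How many nodes can be down while a majority is still available."""
    return ensemble_size - quorum(ensemble_size)


sizes = {n: tolerated_failures(n) for n in (3, 4, 5)}
```

Three nodes tolerate one loss and five tolerate two, which is why five still survives a failure while one node is down for an upgrade. Four nodes tolerate only one, so the even size buys nothing over three.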
wunder
Walter Underwood
wun
in the parameterized portion, like
this:
/handler?features=feature_a_4,feature_b_4,feature_c_4,feature_a_186,feature_b_186,feature_c_186
Right now, I can’t even make a solrconfig.xml that will load. I’ve read
everything I can find on params and function queries.
wunder
Walter Underwood
wun
Thanks!
Now I need to write up the mistakes I made trying to use the solr command.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Feb 24, 2017, at 11:17 AM, Erick Erickson <erickerick...@gmail.com> wrote:
>
> bq: Which mean
Running with this, which works the way we want.
/solr/data/${solr.core.name}
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Feb 24, 2017, at 10:08 AM, Walter Underwood <wun...@wunderwood.org> wrote:
>
> Dang it. I know better t
as Unix V7 (1979), with /var. It should be
easy to do in Solr.
I expected to see the shard names as directories under /solr/data, but I now
remember that I need to set that with a variable.
Time to delete everything and rebuild everything again.
wunder
Walter Underwood
wun...@wunderwood.org
http
The bug is that the dataDir is /solr/data and the index data is in
/apps/solr6/server/solr. Except for the suggest data. No index data should be
outside the dataDir, right?
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Feb 23, 2017, at 6:11
under@new-solr-c02.test3]#
Seems pretty broken to me.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
path has a newer (6.4+)
> solrj version but an older solr-core jar that cannot find this new
> method.
>
> On Sat, Feb 18, 2017 at 5:16 AM, Walter Underwood
> <walter.r.underw...@gmail.com> wrote:
>> Any idea why I would be getting this on a brand new, empty collectio
to reload whenever we need to, like loading
prod data in test or moving search to a different Amazon region.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Feb 21, 2017, at 7:34 PM, Erick Erickson <erickerick...@gmail.com> wrote:
>
> D
of breathing space above that. Not tons, because more old space garbage means
longer collections.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Feb 21, 2017, at 5:18 PM, Erick Erickson <erickerick...@gmail.com> wrote:
>
> Solr is very m
Awesome advice. flat=fast in Solr.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Feb 21, 2017, at 5:17 PM, Dave <hastings.recurs...@gmail.com> wrote:
>
> B is a better option long term. Solr is meant for retrieving
BM25.
Does the host have enough RAM to hold most or all of the index in file buffers?
What are the hit rates on your caches?
Are you using fuzzy matches? N-gram prefix matching? Phrase matching? Shingles?
What version of Java are you running? What garbage collector?
wunder
Walter Underwood
wun
Use Solr 6.3.0. For us, 6.4.x is using about 10X as much CPU under heavy query
load.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Feb 20, 2017, at 5:11 AM, Michael Kuhlmann <k...@solr.info> wrote:
>
> This may be related to SOLR
/String;)V
at
org.apache.solr.update.TransactionLog.writeCommit(TransactionLog.java:457)
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
Yes, 512 MB is far too small. I’m surprised it even starts. We run with 8 GB.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Feb 14, 2017, at 7:39 AM, Leon STRINGER <leon.strin...@ntlworld.com> wrote:
>
>>
>>On 1
Sorry. Haven’t used Windows since seven years ago and haven’t run Windows as a
server for more than a decade.
I would not recommend using Windows as your Solr OS. Windows is just not
designed for that.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog
ter (4 shards, 4-way replication factor) built with the c4.8xlarge
instances. I’m running 64 indexing threads and 1000 doc batches. It might go a
bit faster after we switch the cloud driver in SolrJ.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
>
Are you sure the server is faster? My MacBook Pro is a lot faster than many of
our Amazon EC2 servers.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Feb 13, 2017, at 8:12 PM, Zheng Lin Edwin Yeo <edwinye...@gmail.com> wrote:
>
I’m seeing similar problems here. With 6.4.0, we were handling 6000
requests/minute. With 6.4.1 it is 1000 rpm with median response times around
2.5 seconds. I also switched to the G1 collector. I’m going to back that out
and retest today to see if the performance comes back.
wunder
Walter
hutting
down a node times out after three minutes and needs a kill. And collection
reload times out after three minutes.
Did not have this problem with 6.2.1.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
?
https://cwiki.apache.org/confluence/display/solr/Distributed+Requests
<https://cwiki.apache.org/confluence/display/solr/Distributed+Requests>
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Feb 9, 2017, at 10:13 AM, Walter Under
The default is “false”. I tried “true” and it fails because it can’t parse that
as an int.
The docs need to describe legal values for this.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
as much about New York, but it needs to be the best match for the query
“new york new york”.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Feb 9, 2017, at 5:18 AM, Ere Maijala <ere.maij...@helsinki.fi> wrote:
>
> Thanks Emir.
>