doesn’t exist because it isn’t useful.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Feb 7, 2018, at 9:50 AM, bbarani <bbar...@gmail.com> wrote:
>
>
> I am trying to figure out a way to form boolean (||) query in SOLR.
> I
I think you need the feature in SOLR-629 that adds fuzzy to edismax.
https://issues.apache.org/jira/browse/SOLR-629
The patch on that issue is for Solr 4.x, but I believe someone is working on a
new patch.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog
Zookeeper 3.4.6 is not good? That was the version recommended by Solr docs when
I installed 6.2.0.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Feb 2, 2018, at 9:30 AM, Markus Jelsma <markus.jel...@openindex.io> wrote:
>
> Hel
do a search to see if a collection is ready. If a search for
“q=*:*&rows=0” returns OK, then I’ll send traffic to that node.
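A minimal sketch of that readiness check in Python, assuming the probe is a zero-row match-all query against the standard /select handler. The base URL, collection name, and helper names are hypothetical, not from the original post.

```python
import urllib.error
import urllib.parse
import urllib.request


def ping_url(base_url, collection):
    """Build a match-all query URL that asks for zero rows back."""
    params = urllib.parse.urlencode({"q": "*:*", "rows": 0})
    return f"{base_url.rstrip('/')}/{collection}/select?{params}"


def is_ready(url, fetch=urllib.request.urlopen, timeout=5.0):
    """Return True only if the search comes back with HTTP 200."""
    try:
        with fetch(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status: not ready.
        return False


# Hypothetical node and collection names, for illustration only.
url = ping_url("http://localhost:8983/solr", "books")
```

A deploy script or load balancer hook could call is_ready(url) and only add the node to rotation once it returns True.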
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Feb 1, 2018, at 8:35 AM, Erick Erickson <erickerick...@gmail.com>
a translation to “plus/minus” before indexing or querying.
Query completion made a huge difference, taking our clickthrough rate from 0.45
to 0.55.
Later, we added fuzzy search to handle misspellings.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog
a popularity score as a boost.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Jan 31, 2018, at 4:38 AM, Sravan Kumar <sra...@caavo.com> wrote:
>
> Hi,
> We are using solr for our movie title search.
>
>
> As it is
Use a filter query to filter out all the documents marked deleted.
Don’t use “expunge deletes”; it does more than you want because it forces a
merge. Just commit after sending the delete.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Jan
There is a nice table for all of the field options.
https://lucene.apache.org/solr/guide/7_2/field-properties-by-use-case.html
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Jan 17, 2018, at 11:23 PM, Clemens Wyss DEV <clemens...@mysign.ch>
to fetch blobs by ID and don’t want to use a filesystem, use
a database designed for that. That was the original focus of MySQL, for example.
Solr is not a database. Solr is not a repository. A design using Solr for
primary storage of data is a broken design.
wunder
Walter Underwood
wun
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Jan 3, 2018, at 8:58 AM, Erick Erickson <erickerick...@gmail.com> wrote:
>
> [I probably not need to do this because I have only one shard but I did
> anyway count was different.]
>
> Th
HTTPClient is non-blocking. Send the request, then the client gets control
back. It only blocks when you do the read. So one thread can send multiple
requests then check for each response.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Ja
Solr to keep up with ES in log search features. Likewise, I
would not expect ES to keep up with Solr for product and text search features.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Dec 27, 2017, at 1:33 PM, Erick Erickson <erickerick...@gma
sounds like a terrible idea. Use HTTP.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Dec 26, 2017, at 1:05 PM, Rick Leir <rl...@leirtech.com> wrote:
>
> Per,
> This is more of a question for the Drupal folks. But in passing,
makes the query much larger and much slower.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Dec 21, 2017, at 6:28 AM, Markus Jelsma <markus.jel...@openindex.io> wrote:
>
> Hello Steve,
>
> Well, that is an interesting approach to
affic with that.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
is faster, we’re handling double the query volume
with 3X the docs.
Sorry for the rant, but it has not been a good fall semester for our students
(customers).
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Dec 15, 2017, at 9:46 AM, Erick Erickson <eric
hamburger back into a cow. The PDF standard has improved a lot, but then you
get an OCR’ed PDF.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Dec 7, 2017, at 5:29 PM, Erick Erickson <erickerick...@gmail.com> wrote:
>
> I'm goin
s=0=2=true=jack+and+jill+are+maneuvering+a+2800+kg+boat+near+a+dock.+initially+the+boat%27s+position+is+m+and+its+speed+is+1.9+m%2Fs.+as+the+boa
In your case, “gettingstarted_shard1_replica_n2” should mean that it is an
intra-cluster request. Also, “distrib=false” means it is for a single core.
wunder
W
Anybody have a favorite profiler to use with Solr? I’ve been asked to look at
why our queries are slow at a detailed level.
Personally, I think they are slow because they are so long, up to 40 terms.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
with a local nginx server. That
will allow us to limit concurrent connections. It will also give us a log of
just the client requests.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Dec 5, 2017, at 4:25 AM, Matzdorf, Stefan, Springer SBM
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Dec 4, 2017, at 7:17 PM, Phil Scadden <p.scad...@gns.cri.nz> wrote:
>
> Thanks Eric. I have already followed the solrj indexing very closely - I have
> to do a lot of manipulation at indexi
flag was because something was invoking a full GC to get accurate
memory usage. That was annoying.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Dec 2, 2017, at 8:18 AM, Dominique Bejean <dominique.bej...@eolya.fr>
> wrote:
We use an 8G heap and G1 with Shawn Heisey’s settings. Java 8, update 131.
This has been solid in production with a 32 node Solr Cloud cluster. We do not
do faceting.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Dec 2, 2017, at 7:43
Expanding the query to use both the tagged and untagged term might work. I’m
not sure the effect would be a lot different than boosting the preferred
language.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Nov 30, 2017, at 8:35 AM, Markus Jel
. If the entire
document is in one language, might as well use a filter query for that
language. The tags would work for multiple languages in one document.
Maybe make the untagged term a synonym. For cross-language terms like
“LaserJet”, the untagged one would have worse idf.
wunder
Walter
. Connections
are just blocks of data in the client and OS.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Nov 29, 2017, at 3:41 PM, Toke Eskildsen <t...@kb.dk> wrote:
>
> Walter Underwood <wun...@wunderwood.org> wrote:
>> I kn
.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Nov 29, 2017, at 8:38 AM, Toke Eskildsen <t...@kb.dk> wrote:
>
> Walter Underwood <wun...@wunderwood.org> wrote:
>> I set this in jetty.xml, but it
I’m pretty sure these OOMs are caused by uncontrolled thread creation, up to
4000 threads. That requires an additional 4 GB (1 MB per thread). It is like
Solr doesn’t use thread pools at all.
I set this in jetty.xml, but it still created 4000 threads.
wunder
Walter Underwood
wun
than the disease. We’ll run another load benchmark with thread max at
something realistic, like 200.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Nov 21, 2017, at 8:17 AM, Walter Underwood <wun...@wunderwood.org> wrote:
>
> All our
--module=http
I’m still confused why we are hitting OOM in 6.5.1 but weren’t in 6.3.0. Our
load benchmarks use prod logs. We added suggesters, but those use analyzing
infix, so they are search indexes, not in-memory.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my
he process goes to the bad place, then we
need to wait until someone is paged and kills it manually. Luckily, it usually
drops out of the live nodes for each collection and doesn’t take user traffic.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
Similarity is query time.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Nov 20, 2017, at 4:57 PM, Nawab Zada Asad Iqbal <khi...@gmail.com> wrote:
>
> Hi,
>
> I want to switch to Classic similarity instead of BM25 (default i
Thanks. I found this, which is much more clear than the manual.
http://www.openjems.com/solr-external-file-fields/
The Solr manual should include the info about how to declare the field.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Nov 17, 2
Do I need to define a field with <field> when I use an external file field? I
see the <fieldType> to define it, but the docs don’t say how to define the
field.
The docs say that the file uses the fieldname as part of the filename, but the
<fieldType> directive defines a type name, not a field name. Right?
wunder
Walter
(orders) and other stuff that is currently
in Graphite. We’ll almost certainly move all that to InfluxDB and Grafana.
The Solr metrics were overloading the Graphite database, so we’re the first
service that is trying InfluxDB.
wunder
Walter Underwood
wun...@wunderwood.org
http
Look back down the string to my post. We use Grafana.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Nov 6, 2017, at 11:23 AM, Petersen, Robert (Contr)
> <robert.peters...@ftr.com> wrote:
>
> Interesting! Finally a Grafana use
HTTP
response codes.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Nov 2, 2017, at 9:30 AM, Petersen, Robert (Contr)
> <robert.peters...@ftr.com> wrote:
>
> OK I'm probably going to open a can of worms here... lol
>
>
>
nt 1.0 seems to perform better.
> But not sure why.
>
> I want to try adding some relevant fields (tags, categories) in order to
> have more chances to match the correct results.
>
> Best regards,
> Vincenzo
>
> On Tue, Oct 17, 2017 at 11:38 PM, Walter Underwood <wun.
eG1GC \
-XX:+ParallelRefProcEnabled \
-XX:G1HeapRegionSize=8m \
-XX:MaxGCPauseMillis=200 \
-XX:+UseLargePages \
-XX:+AggressiveOpts \
"
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Oct 18, 2017, at 10:32 PM, maximka19 <moldabeko...@gmail.com>
Linux/x64 (64-bit): 1024 KB
OS X (64-bit): 1024 KB
Oracle Solaris/i386 (32-bit): 320 KB
Oracle Solaris/x64 (64-bit): 1024 KB
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Oct 18, 2017, at 1:44 PM, Walter Underwood <wun...@wunderwood.org>
With an 8GB heap, I’d like to keep thread stack memory to 2GB or under, which
means a maxThreads of 1000.
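As a rough check of that arithmetic, here is a sketch in Python. It assumes every thread reserves a full 1024 KB stack (the figure quoted for 64-bit Linux), which is a worst case; the function names are made up for illustration.

```python
KIB = 1024
GIB = 1024 ** 3


def stack_bytes(threads, stack_kib=1024):
    """Worst-case memory reserved for thread stacks, in bytes."""
    return threads * stack_kib * KIB


def max_threads(budget_bytes, stack_kib=1024):
    """Largest thread count whose stacks fit inside the budget."""
    return budget_bytes // (stack_kib * KIB)


# 1000 threads at 1024 KB each stay well under a 2 GB budget.
print(stack_bytes(1000) / GIB)
# Hard ceiling for a 2 GB budget at the same stack size.
print(max_threads(2 * GIB))
```

The same helper shows why an uncapped pool hurts: stack_bytes(4000) comes out just under 4 GB on top of the heap.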
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Oct 18, 2017, at 1:41 PM, Walter Underwood <wun...@wunderwood.org> wrote:
Jetty maxThreads is set to 10,000, which seems way too big.
The comment suggests 5X the number of CPUs. We have 36 CPUs, which would mean
180 threads, which seems more reasonable.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Oct 16, 2017, at 1:53 AM, Emir Arnautović <emir.arnauto...@sematext.com>
> wrote:
>
> Hi Vincenzo,
> Unless you have really specific ranking requirements, I would not suggest you
I would not do this in Solr.
Post process the log file to split them out. That allows you to change the
definition of “slow” later, reprocess older files, etc.
Do log analysis with log analysis tools. Don’t try to push that too far up the
chain into the production server.
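Something like this Python sketch would do the split offline. It assumes each request line carries a QTime=<ms> field, as Solr request logs normally do; the threshold, the sample lines, and the function name are illustrative only.

```python
import re

# Matches the query time field on a Solr request log line.
QTIME = re.compile(r"\bQTime=(\d+)\b")


def split_by_qtime(lines, slow_ms=1000):
    """Split log lines into (slow, fast) lists by QTime in milliseconds.

    Lines without a QTime field are skipped. Because this runs on the
    log files, slow_ms can be changed later and old files reprocessed.
    """
    slow, fast = [], []
    for line in lines:
        match = QTIME.search(line)
        if match is None:
            continue
        bucket = slow if int(match.group(1)) >= slow_ms else fast
        bucket.append(line)
    return slow, fast


# Made-up sample lines, not taken from a real log.
sample = [
    "INFO [books] webapp=/solr path=/select params={q=foo} status=0 QTime=12",
    "INFO [books] webapp=/solr path=/select params={q=bar} status=0 QTime=2500",
    "INFO Opening new searcher",
]
slow, fast = split_by_qtime(sample, slow_ms=1000)
```

Rerunning with a different slow_ms over archived logs is all it takes to change the definition of “slow”.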
wunder
Walter
million/minute.
We are indexing bigger documents, but seeing 1 million/minute to a cluster with
four shards.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Sep 21, 2017, at 1:18 AM, Emir Arnautović <emir.arnauto...@sematext.com>
> wr
> On Sep 20, 2017, at 6:15 PM, Bill Oconnor <bocon...@plos.org> wrote:
>
> I restart using the standard "sudo service solr start/stop"
You might look into what that actually does.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
1578578283947098112 needs 61 bits. Is it being parsed into a 32-bit target?
That doesn’t explain where it came from, of course.
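The bit count is easy to confirm. A small Python check, where fits_signed is a hypothetical helper written for this example:

```python
value = 1578578283947098112

# Number of bits needed to represent the magnitude of the value.
bits = value.bit_length()


def fits_signed(n, width):
    """True if n is representable as a signed two's-complement integer."""
    return -(2 ** (width - 1)) <= n < 2 ** (width - 1)


print(bits)                    # magnitude bits
print(fits_signed(value, 32))  # does it fit a 32-bit int?
print(fits_signed(value, 64))  # does it fit a 64-bit long?
```

So the value fits a 64-bit long but would overflow anything that parses it as a 32-bit int.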
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Sep 20, 2017, at 3:35 PM, Erick Erickson <erickerick...@gma
start -cloud -h `hostname`'
done
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Sep 20, 2017, at 1:42 PM, Bill Oconnor <bocon...@plos.org> wrote:
>
> Hello,
>
>
> Background:
>
>
> We have been successfully using Solr for o
As I understand it, any node in the cluster will direct the document to the
leader for the appropriate shard.
Works for us.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Sep 19, 2017, at 9:59 AM, David Hastings <hastings.recurs...@gma
Yes, good old HTTP.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Sep 19, 2017, at 9:54 AM, David Hastings <hastings.recurs...@gmail.com>
> wrote:
>
> Do you use HttpSolrClient then?
>
> On Tue, Sep 19, 2017 at 12:26 P
Cloud cluster get
the right docs to the right shards. That runs at 1 million docs/minute, so it
isn’t worth doing anything fancier.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Sep 19, 2017, at 9:05 AM, David Hastings <hastings.recurs...@gma
.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Sep 19, 2017, at 12:18 AM, Toke Eskildsen <t...@kb.dk> wrote:
>
> On Mon, 2017-09-18 at 20:47 -0700, shamik wrote:
>> I did bring down the heap size to 8gb, changed to G1 and
29G on a 30G machine is still a bad config. That leaves no space for the OS,
file buffers, or any other processes.
Try with 8G.
Also, give us some information about the number of docs, size of the indexes,
and the kinds of search features you are using.
wunder
Walter Underwood
wun
Millis=200 \
-XX:+UseLargePages \
-XX:+AggressiveOpts \
"
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Sep 18, 2017, at 7:24 AM, Shamik Bandopadhyay <sham...@gmail.com> wrote:
>
> Hi,
>
> I recently upgraded
How about doing that logic at index time? Make a new field, then copy into it
with that logic using an update request processor.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Sep 12, 2017, at 2:05 PM, Peter Kirk <p...@alpha-solutions.dk>
don’t know anything about
the ColdFusion API. I last looked at ColdFusion in 1997.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Sep 12, 2017, at 6:21 AM, Nick Way <n...@southeastpublishing.com> wrote:
>
> Thank you very mu
We have been running 6.5.1 in production since May. I would not run anything
before that.
The new metrics code caused performance problems. That was fixed in 6.5.0.
There was a memory leak talking to Zookeeper. That was fixed in 6.5.1.
Solr 6.6.1 should be released very soon.
wunder
Walter
.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Sep 7, 2017, at 5:02 PM, Erick Erickson <erickerick...@gmail.com> wrote:
>
> Skimming and to add to what Shawn said about ramBufferSizeMB.
>
> It's totally wasted space pretty
Use a multivalued field. Search for listOfIds:1. Or search for listOfIds:33.
This is one of the simplest things that Solr can do.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Sep 6, 2017, at 6:07 AM, Susheel Kumar <susheel2...@gmail.com>
This should probably be a feature of the analyzing infix suggester.
Right now, the fuzzy suggester is broken with the file dictionary, so we can’t
use fuzzy suggestions at all.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Sep 4, 2017, at 4
That is what I do. Use production logs. I have a JMeter script that sets a
constant request rate.
Before each load benchmark, I reload the collection to clear the caches, then
run 2000 warming queries from the logs. After that, I start the benchmark.
wunder
Walter Underwood
wun
params…
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Sep 1, 2017, at 2:01 PM, Erick Erickson <erickerick...@gmail.com> wrote:
>
> Shawn:
>
> See: https://issues.apache.org/jira/browse/SOLR-7219
>
> Try fq=filter(
CPU is available, etc.
We have one query that extracts 9 million documents from MySQL in about 20
minutes. We have another query on a different MySQL database that takes 90
minutes to get 7 million documents.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog
Think about making a denormalized view, with all the fields needed in one
table. That view gets sent to Solr. Each row is a Solr document.
It could be implemented as a view or as SQL, but that is a useful mental model
for people starting from a relational background.
wunder
Walter Underwood
That would be a really good reason for a 6.7.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Aug 28, 2017, at 8:48 AM, Markus Jelsma <markus.jel...@openindex.io> wrote:
>
> It is, unfortunately, not co
. Those
are designed to handle that.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Aug 24, 2017, at 1:05 PM, Angel Todorov <attodo...@gmail.com> wrote:
>
> Hello,
>
> So I can never have soft auto commit after each update ? Thi
I see a server with 100 GB of memory and processes (java and jsvc) using 203 GB
of virtual memory. Hmm.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Aug 18, 2017, at 12:05 PM, Joe Obernberger <joseph.obernber...@gmail.com>
> wrote:
Why do you want to do this in Solr? This would be pretty easy in SQL. If you
want to sort, use a relational database.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Aug 18, 2017, at 2:52 AM, Luca Dall'Osto <tenacious...@yahoo.it.INVALID>
Just got a stack overflow in the Lucene automata code. Is there a way to save
out the FSA for a bug report?
This is in 6.5.1, so it may be related to
https://issues.apache.org/jira/browse/SOLR-9458
wunder
Walter Underwood
wun...@wunderwo
This might be a hack, but the CSV importer is really fast. Run the query in
your favorite command line and export to CSV, then load it.
You can even make batches. Maybe use ranges of the ID, then delete by query for
that range.
wunder
Walter Underwood
wun...@wunderwood.org
http
of how common those
skills are.
And for tf, a document tagged with both “new york” and “new york city” is not
twice as much about New York. Same for the movie “New York, New York”.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Aug 8, 2017, at 2:18
durability, but Solr is generally not considered to be durable across crashes
or “kill -9”.
https://en.wikipedia.org/wiki/ACID
Also, there is no explicit schema migration support. Schema changes usually
require a full reload from the repository.
wunder
Walter Underwood
wun...@wunderwood.org
”. To me, that means “not a database”.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Aug 5, 2017, at 4:59 AM, Dave <hastings.recurs...@gmail.com> wrote:
>
> And to add to the conversation, 7 year old blog posts are not a reason to
. Straightforward, but not easy to do it fast.
The “Inside MarkLogic Server” paper does a good job of explaining the guts.
Now, back to our regularly scheduled Solr presentations.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Aug 4, 2017, at 8:13 PM, Da
Solr is NOT a database. If you need a database, don’t choose Solr.
If you need both a database and search, choose MarkLogic.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Aug 4, 2017, at 4:16 PM, Francesco Viscomi <fvisc...@gmail.com>
I’m trying to get what I want out of the metrics reporting in Solr. I want the
counts and percentiles for each request handler in each collection. If I have
“/srp”, “/suggest”, and “/seo”, I want three sets of metrics.
I’m getting a lot of weird stuff. For counts for /srp in an eight node
How long are your GC pauses? Those affect all queries, so they make the 99th
percentile slow with queries that should be fast.
The G1 collector has helped our 99th percentile.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Aug 3, 2017, at 8:48
Nagle at Ford Aerospace. I
recommend his note “On Packet Switches with Infinite Storage” (1985) for the
full story. It is only eight pages long, but packed with goodness.
https://tools.ietf.org/html/rfc970
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
Way back in the 1.x days, replication was done with shell scripts and rsync,
right?
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Aug 1, 2017, at 2:45 PM, Shawn Heisey <apa...@elyograg.org> wrote:
>
> On 7/31/2017 12:28 PM, Ma
G1HeapRegionSize=8m \
-XX:MaxGCPauseMillis=200 \
-XX:+UseLargePages \
-XX:+AggressiveOpts \
"
Last week, I benchmarked the 4.x config handling 15,000 requests/minute with a
95th percentile response time of 30 ms, using production logs.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer
Are you sure you need a 100GB heap? The stall could be a major GC.
We run with an 8GB heap. We also run with Xmx equal to Xms, because growing
memory to the max was really time-consuming after startup.
What version of Java? What GC options?
wunder
Walter Underwood
wun...@wunderwood.org
http
the shards are created to keep load and disk usage distributed. If you
want search to keep working after a failure, you will also need to create and
delete additional shards as replicas.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Jul 22, 2
to force RDBMS sharding onto Solr.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Jul 20, 2017, at 8:09 AM, rehman kahloon
> <mrehman_kahl...@yahoo.com.INVALID> wrote:
>
If Apache is returning 400, then it really is a bad request. Debug the request
and fix it.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Jul 19, 2017, at 11:27 PM, mesenthil1
> <senthilkumar.arumu...@viacomcontractor.com> wr
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Jul 20, 2017, at 7:57 AM, Erick Erickson <erickerick...@gmail.com> wrote:
>
> Use the "implicit" router (being renamed "manual". that takes the
> value of a particular fie
A 400 would not be a failure to connect. A 400 means that the client is sending
a bad request.
Look at the Solr logs. Most likely, the document is invalid.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Jul 19, 2017, at 7:54 AM, Susheel Ku
ht": 5285,
"payload": ""
},
{
"term": "Chemistry",
"weight": 4548,
"payload": ""
},
{
"term": "Chemistry",
"weight": 3002,
"payload": ""
},
{
"term": "Intro
robustness by being too clever with the client.
Hacking the client is not a last choice, it is a bad choice.
For queries, there is not much benefit in running the cloud-aware client. A
regular load balancer works just about as well. We use the Amazon load
balancers.
wunder
Walter Underwood
wun
If your Zookeeper cluster is rebooting frequently, you have much, much worse
problems than client connections.
Is Zookeeper unstable in your installation? If so, fix that.
Stop hacking the client.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog
Optimize can take a long time.
Why are you doing an optimize? It doesn’t really optimize the index; it only
forces merges and deletions. Solr does that automatically in the background.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Jul 13, 2
times with three different parser packages on two engines. Never on Solr,
though.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Jul 10, 2017, at 12:40 PM, Allison, Timothy B. <talli...@mitre.org> wrote:
>
>> 4. Write a
4. Write an external program that fetches the file, fetches the metadata,
combines them, and sends them to Solr.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Jul 9, 2017, at 3:03 PM, Giovanni De Stefano <giova...@servisoft.be> wrote:
The deleted records will be automatically cleaned up in the background. You
don’t have to do anything.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Jul 7, 2017, at 1:25 PM, calamita.agost...@libero.it wrote:
>
>
> Sorry , I kn
Solr machine we deploy, even in test
and dev, has 15 GB of RAM, SSD disks, and has an 8 GB Java heap. In prod, we
run with enough RAM that the entire index can live in RAM file buffers.
We don’t do a lot of faceting or other memory-intensive queries. We mostly just
search.
wunder
Walter
I would agree with removing the stopword filter from the example configs. It is
not a “best practice” or even a recommended practice.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Jun 29, 2017, at 8:01 PM, Rick Leir <rl...@leirtech.com>
My blog post has a list of movie titles. I forgot to list the TV series “Once
and Again”.
Some bands that are not searchable with stopwords:
* The Who
* Was (not Was)
* The The
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Jun 29, 2017, at 2
engines
on 16-bit machines. Neither disks nor RAM were big enough to hold the posting
lists for common words.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Jun 29, 2017, at 1:46 PM, Rick Leir <rl...@leirtech.com> wrote:
>
>
Setting lowercaseOperators=false for the request handler defaults fixes this.
Probably also fixes some relevance anomalies.
Thanks!
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Jun 29, 2017, at 6:38 AM, Shawn Heisey <apa...@elyograg.org>
Nope. Haven’t used stopwords for the last 20 years.
I wonder if lowercaseOperators is true. The docs don’t give the default value
for that in edismax.
https://lucene.apache.org/solr/guide/6_6/the-extended-dismax-query-parser.html
wunder
Walter Underwood
wun...@wunderwood.org
http
jectNames:once | (bookTitle_text:once)^4.0)
+((concept_ai_concepts_names_default:again)^2.0 | (question:again)^2.0 |
subjectNames:again | (bookTitle_text:again)^4.0)) ((bookTitle_text:\"once and
again\")^8.0 | (question:\"once and again\")^4.0 |
(concept_ai_concepts_names_defau