Hey All,
On a Solr server running 4.10.2 with three cores, two return the expected
info from /solr/admin/cores?wt=json but the third is missing userData and
lastModified.
The first (artists) and third (tracks) cores from the linked screenshot are
the ones I care about. Unfortunately, the third
In the meantime, I'm still happy to hear any new thoughts / suggestions on
keeping similarity consistent across upgrades.
Thanks again,
Aaron
On Tue, Jul 1, 2014 at 11:14 PM, Aaron Daubman daub...@gmail.com wrote:
In trying to determine some subtle scoring differences (causing
occasionally significant ordering differences) among search results, I
wrote a parser to normalize debug.explain.structured JSON output.
It appears that every score that is different comes down to a difference in
fieldNorm, where the
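A minimal sketch of the kind of normalizer described, assuming Solr's debug.explain.structured shape (nested objects with "value", "description", and "details" keys); field and description strings here are made up for illustration:

```python
def flatten(node, path=""):
    """Recursively flatten a structured explain node into
    (path, rounded value) pairs so two explain trees can be diffed.
    Sibling nodes with identical descriptions would collide - fine for a sketch."""
    here = path + "/" + node.get("description", "")
    yield (here, round(node.get("value", 0.0), 6))
    for child in node.get("details", []):
        yield from flatten(child, here)

def diff_explains(a, b):
    """Return the explain entries whose values differ between two trees."""
    da, db = dict(flatten(a)), dict(flatten(b))
    return {k: (da.get(k), db.get(k))
            for k in da.keys() | db.keys()
            if da.get(k) != db.get(k)}
```

Running this over two instances' explains for the same query/doc pair surfaces exactly which factors (e.g. fieldNorm) diverge.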
Hi Upayavira,
One small question - did you re-index in-between? The index structure
will be different for each.
Yes, the Solr 1.4.1 (working) instance was built using the original schema
and that solr version.
The Solr 3.6.1 (not working) instance was re-built using the new schema and
Solr
I forgot a possibly important piece... Given the different Solr versions,
the schema version (and its related changes in defaults) is also a change:
Solr 1.4.1 has:
<schema name="ourSchema" version="1.1">
Solr 3.6.1 has:
<schema name="ourSchema" version="1.5">
Solr 1.4.1 Relevant Schema Parts - Working
Interestingly, I have run into this same (or very similar) issue when
attempting to run embedded Solr. All of the solr.* classes that were
recently moved to Lucene would not work with the solr.* shorthand - I had
to replace them with the fully-qualified class name. As you found, these shorthands in
the same
Greetings,
I have several custom QueryComponents that have high one-time startup costs
(hashing things in the index, caching things from an RDBMS, etc...)
Is there a way to prevent solr from accepting connections before all
QueryComponents are ready?
Especially, since many of our instances are
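One standard way to pay such one-time costs before queries arrive is a firstSearcher warming listener in solrconfig.xml; the warming query below is just a placeholder:

```xml
<!-- solrconfig.xml: run warm-up queries before the first searcher is exposed -->
<listener event="firstSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst><str name="q">typical warm-up query</str><str name="rows">10</str></lst>
  </arr>
</listener>
```

This warms caches, but custom component initialization generally still has to happen in the component's own init/inform hooks.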
, 2012 at 11:54 AM, Aaron Daubman daub...@gmail.com wrote:
Greetings,
I have several custom QueryComponents that have high one-time startup
costs
(hashing things in the index, caching things from a RDBMS, etc...)
Is there a way to prevent solr from accepting connections before all
(plus when I deploy, my deploy script
runs some actual simple test queries to ensure they return before enabling
the ping handler to return 200s) to avoid this problem.
What are you doing to programmatically disable/enable the ping handler?
This sounds like exactly what I should be doing as
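One common pattern (a sketch; the exact syntax varies by Solr version) is to gate the ping handler on a health-check file that the deploy script creates only after its smoke tests pass:

```xml
<!-- solrconfig.xml: ping fails until the file exists, so the
     load balancer keeps the node out of rotation while warming -->
<requestHandler name="/admin/ping" class="solr.PingRequestHandler">
  <str name="healthcheckFile">server-enabled.txt</str>
</requestHandler>
```

The deploy script then simply touches the file to enable traffic and deletes it to drain the node.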
Greetings,
We have a Solr instance in use that gets some perhaps-atypical queries
and suffers from poor (2 second) QTimes.
Documents (~2,350,000) in this instance are mainly comprised of
various descriptive fields, such as multi-word (phrase) tags - an
average document contains 200-400 phrases
Thanks for the ideas - some followup questions in-line below:
* use shingles e.g. to turn two-word phrases into single terms (how
long is your average phrase?).
Would this be different than what I was calling common grams? (other
than shingling every two words, rather than just common ones?)
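For comparison, a sketch of the two filter chains (word-list file name assumed):

```xml
<!-- shingles: index every adjacent word pair as a single term -->
<filter class="solr.ShingleFilterFactory" maxShingleSize="2" outputUnigrams="true"/>

<!-- common grams: only pairs involving words listed in the file become terms -->
<filter class="solr.CommonGramsFilterFactory" words="commonwords.txt" ignoreCase="true"/>
```

So yes - shingling is the same idea generalized to every adjacent pair, which trades a larger term dictionary for avoiding phrase-position checks entirely.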
Hi Peter,
Thanks for the recommendation - I believe we are thinking along the
same lines, but wanted to check to make sure. Are you suggesting
something different than my #5 (below) or are we essentially
suggesting the same thing?
On Wed, Oct 24, 2012 at 1:20 PM, Peter Keegan
Greetings,
I'm wondering if somebody would please explain why
SolrIndexSearcher.java enforces mutual exclusion of filter and
filterList
(e.g. see:
https://github.com/apache/lucene-solr/blob/trunk/solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java#L2039
)
For a custom application
Greetings,
In a recent batch of solr 3.6.1 slow response time queries the
profiler highlighted downHeap (line 212) in ScorerDocQueue.java as
averaging more than 60ms across the 16 calls I was looking at and
showing it spiking up over 100ms - which, after looking at the code
(two int
Hi Mikhail,
On Fri, Oct 5, 2012 at 7:15 AM, Mikhail Khludnev
mkhlud...@griddynamics.com wrote:
okay. huge rows value is no.1 way to kill Lucene. It's not possible,
absolutely. You need to rethink logic of your component. Check Solr's
FieldCollapsing code, IIRC it makes second search to achieve
On Fri, Oct 5, 2012 at 6:56 AM, Aaron Daubman daub...@gmail.com wrote:
Greetings,
I've been seeing this call chain come up fairly frequently when
debugging longer-QTime queries under Solr 3.6.1 but have not been able
to understand from the code what is really going on - the call graph
Greetings,
I've been seeing this call chain come up fairly frequently when
debugging longer-QTime queries under Solr 3.6.1 but have not been able
to understand from the code what is really going on - the call graph
and code follow below.
Would somebody please explain to me:
1) Why this would
Hi Yonik,
I've been attempting to fix the SUBREADER insanity in our custom
component, and have made perhaps some progress (or is this worse?) -
I've gone from SUBREADER to VALUEMISMATCH insanity:
---snip---
entries_count : 12
entry#0 :
Greetings,
I've recently moved to running some of our Solr (3.6.1) instances
using JDK 7u7 with the G1 GC (playing with max pauses in the 20 to
100ms range). By and large, it has been working well (or, perhaps I
should say that without requiring much tuning it works much better in
general than my
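For reference, the sort of invocation being described (the pause-time target is just the value under experiment):

```
java -XX:+UseG1GC -XX:MaxGCPauseMillis=50 -jar start.jar
```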
Greetings,
Is there a way to configure more graceful handling of field formatting
exceptions when indexing documents?
Currently, there is a field being generated in some documents that I
am indexing that is supposed to be a float but sometimes slips
through as an empty string. (I know, fix the
catch the error on the client, fix/clean/remove, and retry, no?
Otis
--
Search Analytics - http://sematext.com/search-analytics/index.html
Performance Monitoring - http://sematext.com/spm/index.html
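Otis's suggestion in sketch form: scrub each document client-side before sending it, assuming the float fields are known by name (the names below are made up):

```python
FLOAT_FIELDS = {"price", "weight"}  # assumed field names

def clean_doc(doc):
    """Coerce known float fields; drop any value (e.g. empty string)
    that fails to parse rather than failing the whole add."""
    cleaned = dict(doc)
    for field in FLOAT_FIELDS & cleaned.keys():
        try:
            cleaned[field] = float(cleaned[field])
        except (TypeError, ValueError):
            del cleaned[field]  # empty string or junk: drop it
    return cleaned
```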
On Mon, Sep 24, 2012 at 9:21 PM, Aaron Daubman daub...@gmail.com wrote:
Greetings
Yonik, et al.
I believe I found the section of code pushing me into 'insanity' status:
---snip---
int[] collapseIDs = null;
float[] hotnessValues = null;
String[] artistIDs = null;
try {
collapseIDs =
Hi all,
In reviewing a solr instance with somewhat variable performance, I
noticed that its fieldCache stats show an insanity_count of 1 with the
insanity type SUBREADER:
---snip---
insanity_count : 1
insanity#0 : SUBREADER: Found caches for descendants of
ReadOnlyDirectoryReader(segments_k
Hi Tomás,
This probably means that you are using the same field for faceting and for
sorting (tf_normalizedTotalHotttnesss), sorting uses the segment level
cache and faceting uses by default the global field cache. This can be a
problem because the field is duplicated in cache, and then it
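One way out of the duplication (a sketch; field names assumed) is to give sorting its own copy of the field via copyField, so each cache entry backs only one use:

```xml
<field name="tf_normalizedTotalHotttnesss"      type="float" indexed="true" stored="false"/>
<field name="tf_normalizedTotalHotttnesss_sort" type="float" indexed="true" stored="false"/>
<copyField source="tf_normalizedTotalHotttnesss" dest="tf_normalizedTotalHotttnesss_sort"/>
```

Memory use is similar either way, but the segment-level and top-level caches no longer overlap on one field name.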
Greetings,
I'm looking to add some additional logging to a solr 3.6.0 setup to
allow us to determine actual time spent by Solr responding to a
request.
We have a custom QueryComponent that sometimes returns 1+ MB of data
and while QTime is always on the order of ~100ms, the response time at
the
, Aaron Daubman daub...@gmail.com wrote:
Greetings,
I'm looking to add some additional logging to a solr 3.6.0 setup to
allow us to determine actual time spent by Solr responding to a
request.
We have a custom QueryComponent that sometimes returns 1+ MB of data
and while QTime is always
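One container-level option (a sketch for Jetty's jetty.xml; logLatency appends total handling time in ms to each access-log line, which includes the response-write time that QTime misses):

```xml
<Ref id="Handlers">
  <Call name="addHandler">
    <Arg>
      <New class="org.eclipse.jetty.server.handler.RequestLogHandler">
        <Set name="requestLog">
          <New class="org.eclipse.jetty.server.NCSARequestLog">
            <Arg>logs/yyyy_mm_dd.request.log</Arg>
            <Set name="logLatency">true</Set>
          </New>
        </Set>
      </New>
    </Arg>
  </Call>
</Ref>
```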
Robert,
I have a solr 1.4.1 instance and a solr 3.6.0 instance, both configured as
identically as possible (given deprecations) and indexing the same
document.
Why did you do this? If you want the exact same scoring, use the exact
same analysis.
This means specifying luceneMatchVersion =
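i.e., pinning analysis behavior in solrconfig.xml to the old release's Lucene version (Solr 1.4.1 shipped Lucene 2.9, hence the constant assumed here):

```xml
<!-- solrconfig.xml: keep 2.9-era analysis behavior on the newer instance -->
<luceneMatchVersion>LUCENE_29</luceneMatchVersion>
```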
Robert,
So this is lossy: basically you can think of there being only 256
possible values. So when you increased the number of terms only
slightly by changing your analysis, this happened to bump you over the
edge rounding you up to the next value.
more information:
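The rounding cliff is easy to reproduce with any 8-bit encoding; this sketch uses a generic linear quantizer, not Lucene's actual SmallFloat code:

```python
import math

def length_norm(num_terms):
    # Lucene's default lengthNorm is 1/sqrt(numTerms)
    return 1.0 / math.sqrt(num_terms)

def to_byte(norm):
    # stand-in for the lossy float->byte encoding: only 256 possible values
    return round(norm * 255)

# 200 vs 210 terms collapse to the same stored norm...
assert to_byte(length_norm(200)) == to_byte(length_norm(210))
# ...while 230 terms crosses a boundary and scores differently
assert to_byte(length_norm(200)) != to_byte(length_norm(230))
```

So a small analysis change that adds a few terms can leave the stored norm unchanged for most documents while bumping others to the adjacent value - exactly the "edge rounding" effect described.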
Greetings,
I've been digging in to this for two days now and have come up short -
hopefully there is some simple answer I am just not seeing:
I have a solr 1.4.1 instance and a solr 3.6.0 instance, both configured as
identically as possible (given deprecations) and indexing the same document.
Greetings,
I'm wondering if anybody has experienced (and found root cause) for errors
like this. We're running Solr 3.6.0 with latest stable Jetty 7
(7.6.4.v20120524).
I know this is likely due to a client (or the server) terminating the
connection unexpectedly, but we see these fairly frequently
While I look into doing some refactoring, as well as creating some new
UpdateRequestProcessors (and/or backporting), would you please point me to
some reading material on why you say the following:
In this day and age, a custom update handler is almost never the right
answer to a problem -- nor
-- Jack Krupansky
-Original Message- From: Aaron Daubman
Sent: Saturday, June 09, 2012 12:03 AM
To: solr-user@lucene.apache.org
Subject: What would cause: SEVERE: java.lang.ClassCastException:
com.company.MyCustomTokenizerFactory cannot be cast to
org.apache.solr.analysis
Hoss,
The new FieldValueSubsetUpdateProcessorFactory classes look phenomenal. I
haven't looked yet, but what are the chances these will be back-ported to
3.6 (or how hard would it be to backport them?)... I'll have to check out
the source in more detail.
If stuck on 3.6, what would be the best
Greetings,
I am in the process of updating custom code and schema from Solr 1.4 to
3.6.0 and have run into the following issue with our two custom Tokenizer
and Token Filter components.
I've been banging my head against this one for far too long, especially
since it must be something obvious I'm
language="English" protected="protwords.txt"/>
</analyzer>
</fieldtype>
---snip---
On Sat, Jun 9, 2012 at 12:03 AM, Aaron Daubman daub...@gmail.com wrote:
Greetings,
I am in the process of updating custom code and schema from Solr 1.4 to
3.6.0 and have run into the following issue with our two
Thanks for the responses,
By saying "dirty data" you imply that only one of the values is good or
"clean" and that the others can be safely discarded/ignored, as opposed to
true multi-valued data where each value is there for good reason and needs
to be preserved. In any case, how do you
Greetings,
I have dirty source data where some documents being indexed, although
unlikely, may contain multivalued fields that are also required for
sorting. In previous versions of Solr, sorting on this field worked fine
(possibly because few or no multivalued fields were ever encountered?),
Hoss,
: 1) Any recommendations on which best to sub-class? I'm guessing, for this
: scenario with rare batch puts and no evictions, I'd be looking for get
: performance. This will also be on a box with many CPUs - so I wonder if
the
: older LRUCache would be preferable?
i suspect you are
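For the rare-put/frequent-get pattern, the stock configuration choice (before any subclassing; sizes below are placeholders) would usually be FastLRUCache, whose gets are lock-free and so scale better across many CPUs, at the cost of more expensive puts:

```xml
<queryResultCache class="solr.FastLRUCache"
                  size="16384"
                  initialSize="4096"
                  autowarmCount="0"/>
```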
Greetings,
Has anybody gotten Solr 3.6.0 to work well with Jetty 7.6.3, and if so,
would you mind sharing your config files / directory structure / other
useful details?
Thanks,
Aaron
Greetings,
Following the directions here:
http://svn.apache.org/repos/asf/lucene/dev/trunk/dev-tools/maven/README.maven
for building Lucene/Solr with Maven, what is the correct -Dversion to pass
to get-maven-poms?
This seems set up for building -SNAPSHOT; however, I would like to use
maven
before the usual
QueryComponent?
This component would be responsible for loading queries, executing them,
caching results, and for returning those results when these queries are
encountered later on.
Otis
From: Aaron Daubman daub...@gmail.com
Subject: Tips
Hoss, brilliant as always - many thanks! =)
Subclassing the SolrCache class sounds like a good way to accomplish this.
Some questions:
1) Any recommendations on which best to sub-class? I'm guessing, for this
scenario with rare batch puts and no evictions, I'd be looking for get
performance.
Greetings,
I'm looking for pointers on where to start when creating a
custom QueryCache.
Our usage patterns are possibly a bit unique, so let me explain the desired
use case:
Our Solr index is read-only except for dedicated periods where it is
updated and re-optimized.
On startup, I would like