bq: Is my understanding about stored fields correct, that even if excluded
from fl, the data on the disk for a given field would still be read as
part of decompression..
Assuming any stored field (NOT docValues) was read, then this is, indeed,
correct. To be pedantic about it, enough 16K blocks
On 13 January 2017 at 14:40, Shawn Heisey wrote:
> What if there were a schema option that would skip docValue retrieval
> for a field unless the fl parameter were to *explicitly* ask for that
> field? With a typical wildcard value in fl, fields with this option
> enabled
On 1/13/2017 1:02 PM, Erick Erickson wrote:
> What about using the defaults in requestHandlers along with SOLR-3191
> to accomplish this? Let's say that there was an fl-exclusion
> parameter. Now you'd be able to define an exclusion default that would
> exclude your field(s) unless overridden in
On 1/13/2017 5:46 PM, Chetas Joshi wrote:
> One of the things I have observed is: if I use the collection API to
> create a replica for that shard, it does not complain about the config
> which has been set to ReplicationFactor=1. If replication factor was
> the issue as suggested by Shawn,
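For reference, adding a replica by hand goes through the Collections API's ADDREPLICA action. A minimal sketch of building that request (the host, collection, and shard names are placeholders, not from the thread):

```python
# Hedged sketch: construct a Collections API ADDREPLICA request URL.
# Host/collection/shard names below are illustrative placeholders.
from urllib.parse import urlencode


def addreplica_url(base, collection, shard, node=None):
    """Build a Collections API ADDREPLICA URL."""
    params = {"action": "ADDREPLICA", "collection": collection, "shard": shard}
    if node:
        params["node"] = node  # optionally pin the new replica to a node
    return f"{base}/admin/collections?{urlencode(params)}"


url = addreplica_url("http://localhost:8983/solr", "mycollection", "shard1")
```

Issuing that URL (e.g. with curl or a browser) asks the overseer to place a new replica of shard1.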
Thank you so much Erik!
On Fri, Jan 13, 2017 at 4:40 PM, Erick Erickson
wrote:
> Here's what I'd do
> 1> create a new collection with a single shard
> 2> use the MERGEINDEXES core admin API command to merge the indexes
> from the old 2-shard collection
>
> That way you
Erick, I have not changed any config. I have autoAddReplicas=true in the
individual collection config as well as the overall cluster config. Still,
it does not add a replica when I decommission a node.
Adding a replica is overseer's job. I looked at the logs of the overseer of
the solrCloud but
Here's what I'd do
1> create a new collection with a single shard
2> use the MERGEINDEXES core admin API command to merge the indexes
from the old 2-shard collection
That way you have a chance to verify that the merged collection is OK
before deleting the old 2-shard collection.
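The two steps above can be sketched as request-building code; the CoreAdmin MERGEINDEXES action takes a target core plus one or more srcCore parameters. Host and core names here are placeholders, not from the thread:

```python
# Hedged sketch of the two steps Erick describes:
# 1> CREATE a new single-shard collection, 2> MERGEINDEXES into it.
# Host, collection, and core names are illustrative placeholders.
from urllib.parse import urlencode


def create_collection_url(base, name):
    params = {"action": "CREATE", "name": name,
              "numShards": 1, "replicationFactor": 1}
    return f"{base}/admin/collections?{urlencode(params)}"


def mergeindexes_url(base, target_core, src_cores):
    # CoreAdmin MERGEINDEXES: merge source cores into the target core.
    # srcCore may repeat, so build the params as a list of tuples.
    params = [("action", "MERGEINDEXES"), ("core", target_core)]
    params += [("srcCore", c) for c in src_cores]
    return f"{base}/admin/cores?{urlencode(params)}"


create = create_collection_url("http://localhost:8983/solr", "merged")
merge = mergeindexes_url("http://localhost:8983/solr",
                         "merged_shard1_replica1",
                         ["old_shard1_replica1", "old_shard2_replica1"])
```

The old 2-shard collection is untouched until you delete it, which is what makes the verify-then-delete workflow safe.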
On Fri, Jan 13,
Hi All,
I have a collection that has 2 shards, and I am finding that having 2 shards
is unnecessary.
So I would like to delete one of the shards without losing its data.
Illustration:
Before: Collection has Shard 1 and Shard 2
After: Collection has a single shard containing the data of Shard 1 and Shard 2
Joe Obernberger wrote:
[3 billion docs / 16TB / 27 shards on HDFS times 3 for replication]
> Each shard is then hosting about 610GBytes of index. The HDFS cache
> size is very low at about 8GBytes. Suffice it to say, performance isn't
> very good, but again, this
What about using the defaults in requestHandlers
along with SOLR-3191 to accomplish this? Let's
say that there was an fl-exclusion parameter. Now
you'd be able to define an exclusion default that
would exclude your field(s) unless overridden in your
request handler. This could be either a default
I've got an idea for a feature that I think could be very useful. I'd
like to get some community feedback about it, see whether it's worth
opening an issue for discussion.
First, some background info:
As I understand it, the fact that stored fields are compressed means
that even if a particular
In any case, this is really "the sizing question" and generic answers
are not reliable. Here's a long blog about why, but the net-net is
"prototype and measure". Fortunately you can prototype with just a few
nodes (I usually want at least 2 shards) and extrapolate reasonably
well.
As per Scott@FullStory, you should see benefits with many smaller shards
rather than a few bigger ones. Upgrading to Solr 6.2 would also be better, as
there are many improvements in handling multiple shards. See the presentation below
Hi All - we've been experimenting with Solr Cloud 5.5.0 with a 27 shard
(no replication - each shard runs on a physical host) cluster on top of
HDFS. It currently just crossed 3 billion documents indexed with an
index size of 16.1TBytes. In HDFS with 3x replication this takes up
48.2TBytes.
The time functions aren't supported in the SQL interface currently.
Joel Bernstein
http://joelsolr.blogspot.com/
On Fri, Jan 13, 2017 at 10:44 AM, radha krishnan wrote:
> Hi,
>
> can we write an SQL statement and use the /sql handler to get the
> json.facet's "gap"
Well, I've tried much larger values than 8, and it still doesn't seem to
do the job?
For now, assume my users are searching for exact sub strings of a real
title.
Tom
On 13/01/17 16:22, Walter Underwood wrote:
I use a boost of 8 for title with no boost on the content. Both Infoseek and
Inktomi settled on the 8X boost, getting there with completely different
methodologies.
You might not want the title to completely trump the content. That causes some
odd anomalies. If someone searches for “ice age
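The 8X title boost above is typically expressed through the edismax qf parameter. A minimal sketch (the query text is from the thread; the field names title/content match Tom's schema, and the rest is illustrative):

```python
# Hedged sketch: an edismax query weighting title matches 8x over
# content matches, per Walter's 8X figure. Parameter values are
# illustrative, not a definitive recommendation.
from urllib.parse import urlencode

params = {
    "q": "ice age",
    "defType": "edismax",
    "qf": "title^8 content",  # title matches weigh 8x content matches
}
query_string = urlencode(params)
```

Because qf blends both fields rather than sorting title matches strictly first, content can still win for a strong content match, which avoids the "title completely trumps content" anomalies.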
Tom:
The output is numbing, but add debugQuery=true to your query and you'll see
exactly what contributed to the score and why. Otherwise you're flying
blind. Obviously something's trumping your boosting, but you can't pin down
what without the numbers.
You can get an overall sense of what's happening if
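A minimal sketch of such a debug request (the collection name is a placeholder):

```python
# Hedged sketch: adding debugQuery=true to a select request so the
# response includes a per-document score explanation.
# Host and collection name are illustrative placeholders.
from urllib.parse import urlencode

params = {"q": "connected vehicle", "debugQuery": "true"}
url = "http://localhost:8983/solr/mycollection/select?" + urlencode(params)
# The "debug" section of the response then breaks each score down into
# its per-field, per-term contributions.
```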
Hi,
can we write an SQL statement and use the /sql handler to get the
json.facet's "gap" functionality.
Ex facet query :
json.facet: {
  my_histogram: {
    type: range,
    field: i_timestamp,
    start: "2016-10-21T01:00:00Z",
    end: "2016-10-21T02:00:00Z",
    gap: "+1MINUTE",
    mincount: 0
  }
}
Thanks,
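For comparison, the range facet above can be sent through the regular /select handler as a json.facet parameter. A minimal sketch (host and collection name are placeholders; the facet body is the one from the question):

```python
# Hedged sketch: issue the range facet from the question via json.facet
# on the /select handler. Host/collection are illustrative placeholders.
import json
from urllib.parse import urlencode

facet = {
    "my_histogram": {
        "type": "range",
        "field": "i_timestamp",
        "start": "2016-10-21T01:00:00Z",
        "end": "2016-10-21T02:00:00Z",
        "gap": "+1MINUTE",
        "mincount": 0,
    }
}
params = {"q": "*:*", "rows": 0, "json.facet": json.dumps(facet)}
url = "http://localhost:8983/solr/mycollection/select?" + urlencode(params)
```

Note the "+" in "+1MINUTE" must be URL-encoded as %2B, which urlencode handles.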
Hi Scott,
I have created a JIRA ticket
(https://issues.apache.org/jira/browse/SOLR-9962). I will figure out the
patch process.
Thanks,
Radhakrishnan D
On Thu, Jan 12, 2017 at 8:57 AM, Scott Stults <
sstu...@opensourceconnections.com> wrote:
> Radhakrishnan,
>
> That would be an appropriate
I have a few hundred documents with title and content fields.
I want a match in title to trump matches in content. If I search for
"connected vehicle" then a news article that has that in the content
shouldn't be ranked higher than the page with that in the title is
essentially what I want.
Thanks @Toke, for pointing out these options. I'll have a read about
expungeDeletes.
It sounds even more like having Solr filter out 0-counts is a good idea, and I
should handle my use case outside of Solr.
Thanks again,
Sebastian
On Fri, 2017-01-13 at 14:19 +, Sebastian Riemer wrote:
>
Nice, thank you very much for your explanation!
>> Solr returns all fields as facet result where there was some value at
some time, as long as the documents are somewhere in the index, even when
they're marked as deleted. So there must have been a document with
m_mediaType_s=1. Even if
On Fri, 2017-01-13 at 14:19 +, Sebastian Riemer wrote:
> the second search should have been this: http://localhost:8983/solr/w
> emi/select?fq=m_mediaType_s:%221%22&indent=on&q=*:*&start=0&rows=0
> &wt=json
> (or in other words, give me all documents having value "1" for field
> "m_mediaType_s")
>
> Since this
Then I don't understand your problem. Solr already does exactly what you
want.
Maybe the problem is different: I assume that there never was a value of
"1" in the index, leading to your confusion.
Solr returns all fields as facet result where there was some value at
some time as long as the
Hi Bill,
Thanks, that's actually where I come from. But I don't want to exclude values
leading to a count of zero.
Background to this: A user searched for mediaType "book" which gave him 10
results. Now some other task/routine whatever changes all those 10 books to be
say 10 ebooks, because
Set mincount to 1
Bill Bell
Sent from mobile
> On Jan 13, 2017, at 7:19 AM, Sebastian Riemer wrote:
>
> Pardon me,
> the second search should have been this:
> http://localhost:8983/solr/wemi/select?fq=m_mediaType_s:%221%22&indent=on&q=*:*&start=0&rows=0&wt=json
>
> (or in other words, give
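Bill's suggestion in request form, as a minimal sketch (host, collection, and query are placeholders; the field name is from the thread):

```python
# Hedged sketch: facet.mincount=1 drops zero-count buckets from the
# facet response. Host/collection/query are illustrative placeholders.
from urllib.parse import urlencode

params = {
    "q": "*:*",
    "rows": 0,
    "facet": "on",
    "facet.field": "m_mediaType_s",
    "facet.mincount": 1,  # omit facet values whose count is zero
}
query_string = urlencode(params)
```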
Pardon me,
the second search should have been this:
http://localhost:8983/solr/wemi/select?fq=m_mediaType_s:%221%22&indent=on&q=*:*&start=0&rows=0&wt=json
(or in other words, give me all documents having value "1" for field
"m_mediaType_s")
Since this search gives zero results, why is it included in the
Hi,
Please help me understand:
http://localhost:8983/solr/wemi/select?facet.field=m_mediaType_s&facet=on&indent=on&q=*:*&wt=json
returns:
"facet_counts":{
"facet_queries":{},
"facet_fields":{
"m_mediaType_s":[
"2",25561,
"3",19027,
"10",1966,
"11",1705,
I just noticed why setting maxResultsForSuggest to a high value was not a good
thing: now it shows spelling suggestions even on correctly spelled words.
I think what I would need is the logic of
SuggestMode.SUGGEST_WHEN_NOT_IN_INDEX, but with a configurable limit instead of it being
Hi Alessandro,
Thanks for your explanation. It helped a lot. Although setting
"spellcheck.maxResultsForSuggest" to a value higher than zero was not enough. I
also had to set "spellcheck.alternativeTermCount". With that done, I now get
suggestions when searching for 'mycet' (a misspelling of
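The combination Sebastian describes, as a minimal request sketch (the query term 'mycet' is from the thread; the parameter values 5 are illustrative placeholders, not recommendations):

```python
# Hedged sketch combining the two spellcheck parameters from the thread.
# The numeric values here are illustrative, not tuned recommendations.
from urllib.parse import urlencode

params = {
    "q": "mycet",
    "spellcheck": "true",
    # return suggestions even when the query matched, up to this many hits
    "spellcheck.maxResultsForSuggest": 5,
    # consider suggestions for terms that already exist in the index
    "spellcheck.alternativeTermCount": 5,
}
query_string = urlencode(params)
```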
On Thu, 5 Jan 2017 16:31:35 +
Charlie Hull wrote:
> On 05/01/2017 13:30, Morten Bøgeskov wrote:
> >
> >
> > Hi.
> >
> > We've got a SolrCloud which is sharded and has a replication factor of
> > 2.
> >
> > The 2 replicas of a shard may look like this:
> >
> > Num Docs: