On 12/29/2014 08:08 PM, ralph tice wrote:
Like all things it really depends on your use case. We have 160B
documents in our largest SolrCloud and doing a *:* to get that count takes
~13-14 seconds. Doing a text:happy query only takes ~3.5-3.6 seconds cold,
subsequent queries for the same terms
On 12/29/2014 09:53 PM, Jack Krupansky wrote:
And that Lucene index document limit includes deleted and updated
documents, so even if your actual document count stays under 2^31-1,
deleting and updating documents can push the apparent document count over
the limit unless you very aggressively
On 12/29/2014 10:30 PM, Toke Eskildsen wrote:
That being said, I acknowledge that it helps with stories to get a feel of what
can be done.
That's pretty much what I'm after, mostly to reassure myself that it can
be done. Even if it does require a lot of hardware (which is fine).
At
Hi folks!
I'm studying the migration process from our current solr 3.6 multitenant
cluster (single master, multiple slaves) setup to a solrcloud 4.10.3 but I
have a question about the tlog.
First of all, I will try to give some context:
- 1 single master and N slaves.
- around 300
Hi,
In a query having lots of wildcards, can we put a limit on the number of
term expansions done against a wildcard token, something like
maxBooleanClauses?
Thanks,
Modassar
On Mon, Dec 29, 2014 at 11:15 AM, Modassar Ather modather1...@gmail.com
wrote:
Thanks Jack for your suggestions.
Shawn Heisey [apa...@elyograg.org] wrote:
I believe it would be useful to organize a session at Lucene Revolution,
possibly more interactive than a straight presentation, where users with
very large indexes are encouraged to attend. The point of this session
would be to exchange war stories,
On 12/30/2014 2:16 AM, Samuel García Martínez wrote:
I'm studying the migration process from our current solr 3.6 multitenant
cluster (single master, multiple slaves) setup to a solrcloud 4.10.3 but I
have a question about the tlog.
First of all, I will try to give some context:
-
On 12/30/2014 4:16 AM, Modassar Ather wrote:
In a query having lots of wildcards, can we put a limit on the number of
term expansions done against a wildcard token, something like
maxBooleanClauses?
I'm not aware of anything for limiting wildcard terms, but I'm willing
to be surprised.
As
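For reference, the maxBooleanClauses setting that the question uses as an analogy lives in solrconfig.xml. A sketch of the existing knob (note: it caps boolean clauses, it does not limit wildcard term expansion):

```xml
<!-- solrconfig.xml: caps how many clauses a BooleanQuery may contain.
     This does NOT bound wildcard/prefix expansion, because Solr 4.x
     rewrites wildcard queries with a constant-score rewrite method
     rather than expanding them into boolean clauses. -->
<query>
  <maxBooleanClauses>1024</maxBooleanClauses>
</query>
```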
Please tell us a bit more about how you run your SOLR instances.
When we try to run SOLR with 5 shards, 50GB per shard, we often get
OutOfMemoryError (especially for group queries). And while indexing, SOLR
often crashes (without exceptions; some JVM issue).
We are using Heliosearch.
On 12/30/2014 5:43 AM, Toke Eskildsen wrote:
Shawn Heisey [apa...@elyograg.org] wrote:
I believe it would be useful to organize a session at Lucene Revolution,
possibly more interactive than a straight presentation, where users with
very large indexes are encouraged to attend. The point of
I actually did that once as a test years ago, as well as support for
paging through the wildcard terms with a starting offset, and it worked
great.
One way to think of the feature is as the ability to sample the values of
the wildcard. I mean, not all queries require absolute precision. Sometimes
bq: I did at some point try to write a long blog entry on Solr
hardware and setup for non-small corpuses, but have to give up:
Man, this makes me laugh! Oh the memories!
A common question from sales, quite a reasonable one at that: can we
have a checklist that we can use to give clients an idea
Thanks Erick!
Yes, if I set splitOnCaseChange=0, then of course it'll work -- but then
query for mixedCase will no longer also match mixed Case.
I think I want WDF to... kind of do all of the above.
Specifically, I had thought that it would allow a query for mixedCase
to match both/either
If people are so gung-ho to go down the endless-pain rabbit-hole route
by heavily under-configuring their clusters, I guess that's their choice,
but I would strongly advise against it. Sure, a small band of "the few and
the proud" warhorses can proudly proclaim how they did it, and a small
number of
Right, that's what I meant by WDF not being magic - you can configure it
to match any three out of four use cases as you choose, but there is no
choice that matches all of the use cases.
To be clear, this is not a bug in WDF, but simply a limitation.
-- Jack Krupansky
On Tue, Dec 30, 2014 at
I guess I don't understand what the four use cases are, or the three out
of four use cases, or whatever. What the intended uses of the WDF are.
Can you explain what the intended use of setting:
generateWordParts=1 catenateWords=1 splitOnCaseChange=1
Is that supposed to do something useful (at
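For context, the settings being asked about would sit on a WordDelimiterFilterFactory in schema.xml, roughly like this (the field type name and surrounding chain are illustrative, not from the thread):

```xml
<fieldType name="text_wdf" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- splitOnCaseChange=1 splits "mixedCase" into "mixed","Case";
         catenateWords=1 additionally emits the joined token "mixedcase"
         (after the lowercase filter below) -->
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1"
            catenateWords="1"
            splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```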
On 30 December 2014 at 11:12, Jonathan Rochkind rochk...@jhu.edu wrote:
I'm a bit confused about what splitOnCaseChange combined with catenateWords
is meant to do at all. It _is_ generating both the split and single-word
tokens at query time
Have you tried only having WDF during indexing with
I bet that while there are no specific numbers, there are indicators
that everybody who knows what they are doing looks at to decide
which particular aspect of configuration is hurting most.
So perhaps a good article would be not so much the concrete numbers
but the indicators to check. I
On 12/30/14 11:45 AM, Alexandre Rafalovitch wrote:
On 30 December 2014 at 11:12, Jonathan Rochkind rochk...@jhu.edu wrote:
I'm a bit confused about what splitOnCaseChange combined with catenateWords
is meant to do at all. It _is_ generating both the split and single-word
tokens at query time
I do have a more thorough discussion of WDF in my Solr Deep Dive e-book:
http://www.lulu.com/us/en/shop/jack-krupansky/solr-4x-deep-dive-early-access-release-7/ebook/product-21203548.html
You're not wrong about anything here... you just need to accept that WDF
is not magic and can't handle every
Hi,
Re. AND/OR boolean lookup for ‘infix’ suggestion. I checked that Lucene does
have underlying support for this via the “allTermsRequired” boolean. However
this feature, along with highlighting (on/off), is currently hardwired in
Lucene, and hidden in Solr.
This issue has previously been
Okay, thanks. I'm not sure if it's my lack of understanding, but I feel
like I'm having a very hard time getting straight answers out of you
all, here.
I want the query mixedCase to match both/either mixed Case and
mixedCase in the index.
What configuration of WDF at index/query time would
You want preserveOriginal=“1”.
You should only do this processing at index time.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/
On Dec 30, 2014, at 9:33 AM, Jonathan Rochkind rochk...@jhu.edu wrote:
Okay, thanks. I'm not sure if it's my lack of understanding,
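A sketch of what that advice could look like in schema.xml (the field type name is made up; the query-time chain deliberately omits WDF, so the splitting happens only at index time):

```xml
<fieldType name="text_preserve" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- preserveOriginal=1 keeps "mixedCase" in the index alongside
         the split parts "mixed" and "case" -->
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1"
            preserveOriginal="1"
            splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```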
On 12/30/14 12:35 PM, Walter Underwood wrote:
You want preserveOriginal=“1”.
You should only do this processing at index time.
If I only do this processing at index time, then mixedCase at query
time will no longer match mixed Case in the index/source material.
I think I'm having trouble
On 12/30/14 12:42 PM, Jonathan Rochkind wrote:
On 12/30/14 12:35 PM, Walter Underwood wrote:
You want preserveOriginal=“1”.
You should only do this processing at index time.
If I only do this processing at index time, then mixedCase at query
time will no longer match mixed Case in the
On 12/30/2014 1:19 AM, Bram Van Dam wrote:
We had a look at Heliosearch a while ago and found it unsuitable. Seems
like they're trying to make use of some native x86_64 code and HotSpot
JVM specific features which we can't use. Some of our clients use IBM's
JVM so we're pretty much limited to
There are two approaches for the query “mixedCase” to match “mixed Case” in the
original document.
1. Add an index time synonym.
2. Add a ShingleFilterFactory to the index analysis chain.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/
On Dec 30, 2014, at 9:50
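Option 2 could look roughly like this in the index analyzer (a sketch, not from the thread; with tokenSeparator="" the shingle of "mixed Case" becomes the joined token "mixedcase" after lowercasing):

```xml
<analyzer type="index">
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <!-- emit adjacent-word pairs joined with no separator,
       plus the original single-word tokens -->
  <filter class="solr.ShingleFilterFactory"
          maxShingleSize="2"
          tokenSeparator=""
          outputUnigrams="true"/>
  <filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
```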
Thanks for the quick reply!
We just want to use SolrCloud because it simplifies the operations process
and cluster management: centralized configuration, replica management, and
so on.
I've been playing with a 4-node cluster and watching the tlog and possible
issues and it seems too
I'm running Solr 4.8 in a distributed environment (2 shards). I have added the
spellcheck component to my request handler. In my test system, which is not
distributed, it works. But when I move it to the Dev box, which is distributed,
2 shards, it is not working. Is there something additional I
Did you try the shards parameter? See:
https://cwiki.apache.org/confluence/display/solr/Spell+Checking#SpellChecking-DistributedSpellCheck
On Tue, Dec 30, 2014 at 2:20 PM, Charles Sanders csand...@redhat.com wrote:
I'm running Solr 4.8 in a distributed environment (2 shards). I have added
the
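For reference, a distributed spellcheck request with an explicit shards parameter looks roughly like this (host, core, and query values are made up for illustration):

```
http://host1:8983/solr/mycore/select
  ?q=somequery
  &spellcheck=true
  &shards=host1:8983/solr/mycore,host2:8983/solr/mycore
  &shards.qt=/select
```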
I've raised https://issues.apache.org/jira/browse/SOLR-6903 for this, as I
consider it a bug. Attached to the JIRA is a modified test demonstrating
the failure. The test fails on 5.x and 4.x.
Cheers,
-Brendan
On 30 December 2014 at 13:53, Brendan Humphreys bren...@canva.com wrote:
Thanks for
Thanks for the suggestion.
I did not do that originally because the documentation states:
This parameter is not required for the /select request handler.
Which is what I am using. But I gave it a go, even though I'm not certain of
the shard names. Now I have an NPE.
On 12/30/2014 5:03 PM, Charles Sanders wrote:
Thanks for the suggestion.
I did not do that originally because the documentation states:
This parameter is not required for the /select request handler.
Which is what I am using. But I gave it a go, even though I'm not certain of
the
For the initial release only JSON output format is supported with the
/export feature. Also there is no built-in distributed support yet. Both of
these features are likely to follow in future releases.
For the initial release you'll need a client that can handle the JSON
format and distributed
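For reference, a single-node /export request looks roughly like this (core and field names are made up; /export requires a sort and an fl list, the fields must have docValues, and the response streams as JSON):

```
http://localhost:8983/solr/mycore/export?q=*:*&sort=id+asc&fl=id,name
```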
Mikhail,
How can I get a nightly build with the fix for SOLR-5147 included? I've
searched and found that nightly builds are not available to the general
public. Is there any URL where they post their nightly builds?
Thanks in advance
Rajesh Panneerselvam
From: Mikhail Khludnev [via Lucene]
Rajesh,
There isn't one: the JIRA is still open, and the patch wasn't committed anywhere.
On Wed, Dec 31, 2014 at 8:27 AM, Rajesh rajesh.panneersel...@aspiresys.com
wrote:
Mikhail,
How can I get a nightly build with fix for SOLR-5147 included. I've
searched and found that nightly build will not be available
Thanks Eric and Shawn. Here is why I am trying to do so. I may be missing
something here since this is relatively new to me. Appreciate your help and
time. *I will elaborate on what I am trying to achieve here.*
I am trying to install SolrCloud and my machines typically have 5 drives
which are
Oh! Thanks Mikhail. But I see a comment in that JIRA, above yours, from
Thomas Champagne saying that the patch was committed to current trunk.
Is it not for this issue, Mikhail?
Thanks in advance
Rajesh Panneerselvam
From: Mikhail Khludnev [via Lucene]
On 12/30/2014 11:44 PM, Rajesh wrote:
Oh! Thanks Mikhail. But I could see a comment in that JIRA, above your
comment which is from Thomas champagne that the patch was committed to
current trunk. Is it not for this issue Mikhail?
The message from Thomas Champagne indicates that he updated the
Is there a way to get the trunk so I can apply the same patch to check this
functionality? If so, where can I get the trunk build?
Rajesh,
it seems you need the trunk to apply the patch on. My favorite way to do
this is
https://github.com/apache/lucene-solr/
Have a good hack!
On Wed, Dec 31, 2014 at 10:19 AM, Rajesh rajesh.panneersel...@aspiresys.com
wrote:
Is there a way to get the trunk and I can update the same patch to
On 12/31/2014 12:19 AM, Rajesh wrote:
Is there a way to get the trunk and I can update the same patch to check this
functionality. If so, where can I get the trunk build?
http://wiki.apache.org/solr/HowToContribute#Getting_the_source_code
You will need a number of software components,
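Putting the two suggestions together, the checkout-and-patch steps might look like this (an illustrative command transcript; the patch filename is an example, adjust to the actual attachment on SOLR-5147):

```
# check out trunk from the GitHub mirror
git clone https://github.com/apache/lucene-solr.git
cd lucene-solr

# apply the patch downloaded from the JIRA issue
patch -p0 < SOLR-5147.patch

# build (4.x/trunk used ant)
ant compile
```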