have a
slice of the fields. Then separate Solr clusters could be used for each of
the slices.
-- Jack Krupansky
On Fri, Mar 20, 2015 at 7:12 AM, varun sharma
wrote:
> Requirements of the system that we are trying to build are for each date
> we need to create a SOLR index containing abo
Which query parser are you using? The dismax query parser does not support
wildcards or "*:*".
Either way, the error message is unhelpful - worth filing a Jira.
-- Jack Krupansky
On Fri, Mar 20, 2015 at 7:21 AM, Vishnu Mishra wrote:
> Hi, I am using solr 4.10.3 and doing dist
.
I think it's worth a Jira - text types should use language codes, not
country codes.
-- Jack Krupansky
On Tue, Mar 17, 2015 at 1:35 PM, Eduard Moraru wrote:
> Hi,
>
> First of all, a bit of a disclaimer: I am not a Czech language speaker, at
> all.
>
> We are using Sol
Great, glad to hear it!
One last question: What release of Solr are you using?
-- Jack Krupansky
On Tue, Mar 17, 2015 at 11:43 AM, Arsen wrote:
> Hello Jack,
>
> Jack, you made "my day" for me.
>
> Indeed, when I inserted space between "(" and "*:*
There was a Solr release with a bug that required that you put a space
between the left parenthesis and the "*:*". The edismax parsed query here
indicates that the "*:*" has not parsed properly.
You have "area", but in your jira you had a range query.
-- Jack Krupan
Oops... I said "StatsInfo" and that should have been "StatsCache"
("").
-- Jack Krupansky
On Fri, Mar 13, 2015 at 6:04 PM, Anshum Gupta
wrote:
> There's no rough formula or performance data that I know of at this point.
> About the guidance, if you wa
sted query term with
"\u0020".
-- Jack Krupansky
On Fri, Mar 13, 2015 at 2:37 AM, Rajesh
wrote:
> Hi,
>
> I want to retrieve the parent document which contain "Test Street" in
> street
> field or if any of it's child contain "Test Street" in
le now using Distributed IDF as their default?
I'm not currently using this, but the existing doc and Jira are too minimal
to offer guidance as requested above. Mostly I'm just curious.
Thanks.
-- Jack Krupansky
citly
registered (refer to SOLR-6792)*". IOW, remove the XML
element from your solrconfig.
As far as the document analysis request handler, that should still be fine.
Are you encountering some problem? The first log line you gave is just an
INFO - information only, not a problem.
-- Jack Krupans
just trying to match
the product name and availability.
-- Jack Krupansky
On Tue, Mar 3, 2015 at 4:51 PM, Tom Devel wrote:
> Hi,
>
> I am running Solr 5.0.0 and have a question about proximity search and
> multiValued fields.
>
> I am indexing xml files of the following form
You could simply hash the value before sending it to Solr and then hash the
user query before sending it to Solr as well. Do you need or want only
exact matches, or do you need keyword search, wildcards, etc?
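If only exact matches are needed, the hashing approach can be sketched as follows (a minimal Python sketch; the normalization step and the choice of SHA-256 are assumptions, not requirements):

```python
import hashlib

def hash_for_solr(value: str) -> str:
    """Normalize, then hash, so only exact (normalized) matches are possible."""
    # Assumption: surrounding whitespace and case should not affect matching.
    normalized = value.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

# Index time: send the digest to Solr instead of the raw value.
indexed = hash_for_solr("Confidential Name")

# Query time: hash the user's input the same way before building the query.
queried = hash_for_solr("  confidential name ")

assert indexed == queried
```

Keyword search, wildcards, and substring matching are all lost with this scheme, which is exactly the trade-off the question is about.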
-- Jack Krupansky
On Fri, Feb 27, 2015 at 4:38 PM, Alexandre Rafalovitch
wrote
Most of the magic is done internal to the query parser which actually
inspects the index analyzer chain when a leading wildcard is present. Look
at the parsed_query in the debug response, and you should see that special
prefix query.
-- Jack Krupansky
On Thu, Feb 26, 2015 at 3:49 PM, jaime
Please post your field type... or at least confirm a comparison to the
example in the javadoc:
http://lucene.apache.org/solr/4_10_3/solr-core/org/apache/solr/analysis/ReversedWildcardFilterFactory.html
-- Jack Krupansky
On Thu, Feb 26, 2015 at 2:38 PM, jaime spicciati
wrote:
> All,
>
s the shards.qt
parameter as suggested, to re-emphasize to people that if they want to use
a custom handler in distributed mode, then they will most likely need this
parameter.
-- Jack Krupansky
On Thu, Feb 26, 2015 at 11:28 AM, Mikhail Khludnev <
mkhlud...@griddynamics.com> wrote:
> Hel
.
Please confirm which doc you were reading for the tutorial steps.
-- Jack Krupansky
On Thu, Feb 26, 2015 at 6:17 AM, rupak wrote:
> Hi,
>
> I am new in Solr and using Solr 5.0.0 search server. After installing when
> I’m going to search any keyword in solr 5.0.0 it does not give any re
Solr also now has a schema API to dynamically edit the schema without the
need to manually edit the schema file:
https://cwiki.apache.org/confluence/display/solr/Schema+API#SchemaAPI-AddaDynamicFieldRule
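For example, a dynamic field rule can be added with a POST to /solr/<collection>/schema (a sketch; the field name and type here are placeholders):

```json
{
  "add-dynamic-field": {
    "name": "*_txt",
    "type": "text_general",
    "stored": true
  }
}
```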
-- Jack Krupansky
On Wed, Feb 25, 2015 at 3:15 PM, Vishal Swaroop
wrote:
> Thanks a
As a general proposition, your first stop with any query interpretation
questions should be to add the debugQuery=true parameter and look at the
parsed_query in the query response, which shows how the query is really
interpreted.
-- Jack Krupansky
On Wed, Feb 25, 2015 at 8:21 AM, wrote:
>
It's a string field, so there shouldn't be any analysis. (read back in the
thread for the field and field type.)
-- Jack Krupansky
On Tue, Feb 24, 2015 at 3:19 PM, Alexandre Rafalovitch
wrote:
> What happens if the query does not have wildcard expansion (*)? If the
> behavior
u provided
in this thread.
-- Jack Krupansky
On Tue, Feb 24, 2015 at 2:35 PM, Arun Rangarajan
wrote:
> Exact query:
> /select?q=raw_name:beyonce*&wt=json&fl=raw_name
>
> Response:
>
> { "responseHeader": {"status": 0,"QTime": 0,
Please post the info I requested - the exact query, and the Solr response.
-- Jack Krupansky
On Tue, Feb 24, 2015 at 12:45 PM, Arun Rangarajan
wrote:
> In our case, the lower-casing is happening in a custom Java indexer code,
> via Java's String.toLowerCase() method.
>
> I
eyword
tokenizer and then filter it for lower case, such as when the user query
might have a capital "B". String field is most appropriate when the field
really is 100% raw.
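A field type along these lines keeps the whole value as a single token but lower-cases it at both index and query time (a sketch; the type name is arbitrary):

```xml
<fieldType name="string_ci" class="solr.TextField" sortMissingLast="true">
  <analyzer>
    <!-- Whole value as one token, then lower-cased. -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```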
-- Jack Krupansky
On Mon, Feb 23, 2015 at 7:37 PM, Arun Rangarajan
wrote:
> Yes, it is a string field and not
Is it really a string field - as opposed to a text field? Show us the field
and field type.
Besides, if it really were a "raw" name, wouldn't that be a capital "B"?
-- Jack Krupansky
On Mon, Feb 23, 2015 at 6:52 PM, Arun Rangarajan
wrote:
> I have a string fi
It's never helpful when you merely say that it "did not work" - detail the
symptom, please.
Post both the query and the response. As well as the field and type
definitions for the fields for which you expected term vectors - no term
vectors are enabled by default.
-- Jack Krupans
The edismax query parser has a few too many parsing heuristics, causing way
too many odd combinations that are not exhaustively tested.
-- Jack Krupansky
On Sat, Feb 21, 2015 at 5:43 PM, Tang, Rebecca
wrote:
> Hi there,
>
> I have a field pg_int which is number of pages stored as intege
Please provide a few examples that illustrate your requirements.
Specifically, requirements that are not met by the existing Solr stemming
filters. What is your specific goal?
-- Jack Krupansky
On Wed, Feb 18, 2015 at 10:50 AM, dinesh naik
wrote:
> Hi,
> IS there a way to achieve lemmati
ueries with operators and the case of a leading or trailing
stopword. The old Lucid query parser did have better support for queries
with stop words, but that's no longer available in their current product.
-- Jack Krupansky
On Mon, Feb 16, 2015 at 8:16 PM, Alexandre Rafalovitch
wrote:
time
when they are not at either end of the query. This way, queries such as "to
be or not to be", "vitamin a", and "the office" can still provide
meaningful and precise matches even as stop words are generally ignored.
-- Jack Krupansky
On Mon, Feb 16, 2015 at 4
t in invariants, but also in the actual request, which is a
contradiction in terms - what is your actual intent? This isn't the cause
of the exception, but does raise questions of what you are trying to do.
4. Why don't you have a q parameter for the actual query?
-- Jack Krupansky
On
oss users, so a given query is likely to have been queried
recently by another user.
-- Jack Krupansky
On Sat, Feb 14, 2015 at 3:39 PM, jaime spicciati
wrote:
> All,
> This is my current understanding of how SolrCloud load balancing works...
>
> Within SolrCloud, for a cluster with more
There is no recommendation built into Solr itself, but you might get some
good ideas from this presentation:
http://www.slideshare.net/treygrainger/building-a-real-time-solrpowered-recommendation-engine
-- Jack Krupansky
On Fri, Feb 13, 2015 at 8:33 AM, wrote:
> Sir ,
>I need to kno
that soft commit waits for background merges! (Hoss??)
-- Jack Krupansky
On Fri, Feb 13, 2015 at 4:47 PM, Otis Gospodnetic <
otis.gospodne...@gmail.com> wrote:
> Check
> http://search-lucene.com/?q=commit+wait+block&fc_type=mail+_hash_+user
>
> e.g. http://search-
tenant has their own app and the service provider controls the Solr
server but has no control over the app or load.
The first is supported by Solr. The second is not, other than the service
provider spinning up separate instances of Solr on separate physical
servers.
-- Jack Krupansky
On Thu
this front?
-- Jack Krupansky
On Wed, Feb 11, 2015 at 8:05 AM, Erick Erickson
wrote:
> bq: Are there any such structures?
>
> Well, I thought there were, but I've got to admit I can't call any to mind
> immediately.
>
> bq: 2b is just the hard limit
>
> Yeah,
l not be a matter of how many documents you can load, but
whether the query response latency for those documents is sufficient.
-- Jack Krupansky
On Wed, Feb 4, 2015 at 4:54 PM, Arumugam, Suresh
wrote:
> Hi All,
>
>
>
> We are trying to load 14+ Billion documents into Solr. But we a
The Solr properties can also be defined in solrcore.properties and
core.properties files:
https://cwiki.apache.org/confluence/display/solr/Configuring+solrconfig.xml
-- Jack Krupansky
On Tue, Feb 3, 2015 at 3:31 PM, O. Olson wrote:
> Thank you Jim. I was hoping if there is an alternative
Sorry, that feature is not available in Solr at this time. You could
implement an update processor which copied only the desired input field
values. This can be done in JavaScript using the script update processor.
-- Jack Krupansky
On Mon, Feb 2, 2015 at 2:53 AM, danny teichthal wrote:
>
need to be able to handle.
-- Jack Krupansky
On Wed, Jan 28, 2015 at 5:56 AM, thakkar.aayush
wrote:
> I have around 1 million job titles which are indexed on Solr and am looking
> to improve the faceted search results on job title matches.
>
> For example: a job search for *Resear
Take a look at the RegexTransformer. Or, in some cases you may need to use
the raw ScriptTransformer.
See:
https://cwiki.apache.org/confluence/display/solr/Uploading+Structured+Data+Store+Data+with+the+Data+Import+Handler
-- Jack Krupansky
On Sat, Jan 24, 2015 at 3:49 PM, Carl Roberts wrote
How are you currently importing data?
-- Jack Krupansky
On Sat, Jan 24, 2015 at 3:42 PM, Carl Roberts wrote:
> Sorry if I was not clear. What I am asking is this:
>
> How can I parse the data during import to tokenize it by (:) and strip the
> cpe:/o?
>
>
>
> On 1/2
which treated the colons as token separators.
-- Jack Krupansky
On Sat, Jan 24, 2015 at 3:28 PM, Alexandre Rafalovitch
wrote:
> You are using keywords here that seem to contradict with each other.
> Or your use case is not clear.
>
> Specifically, you are saying you are getting s
or maybe use a Solr update processor to pull the
string apart and store the individual pieces as separate fields.
As always, the first question is not how to store your data, but how your
users intend to access your data. Post some sample queries. I imagine that
any sane user would like to refere
That's what the filter is doing - transforming text into phonetic codes at
index time. And at query time as well to do the phonetic matching in the
query. The actual phonetic codes are stored in the index for the purposes
of query matching.
-- Jack Krupansky
On Fri, Jan 23, 2015 at 12:
/org/apache/solr/handler/FieldAnalysisRequestHandler.html
and in solrconfig.xml
-- Jack Krupansky
On Thu, Jan 22, 2015 at 8:42 AM, Amit Jha wrote:
> Hi,
>
> I need to know how can I retrieve phonetic codes. Does solr provide it as
> part of result? I need codes for record matching.
&g
Presence of a wildcard in a query term is detected by the traditional Solr
and edismax query parsers and causes normal term analysis to be bypassed.
As I said, wildcards are a specific feature that dismax specifically
doesn't support - this has nothing to do with edismax.
-- Jack Krupansk
The dismax query parser does not support wildcards. It is designed to be
simpler.
-- Jack Krupansky
On Thu, Jan 22, 2015 at 5:57 PM, Jorge Luis Betancourt González <
jlbetanco...@uci.cu> wrote:
> I was also suspecting something like that, the odd thing was that the with
> the dismax
Solr
tried to find the remaining terms in the default query field.
-- Jack Krupansky
On Thu, Jan 22, 2015 at 5:47 PM, Carl Roberts wrote:
> Hi,
>
> How do you query a sentence composed of multiple words in a description
> field?
>
> I want to search for sentence "Oracle Fusi
The problem is that the presence of a wildcard causes Solr to skip the
usual token analysis. But... you could add a "multiterm" analyzer, and then
the wildcard would just get treated as punctuation.
-- Jack Krupansky
On Thu, Jan 22, 2015 at 4:33 PM, Jorge Luis Betancourt González &
It sounds like your app needs a lot more RAM so that it is not doing so
much I/O.
-- Jack Krupansky
On Tue, Jan 20, 2015 at 9:24 AM, Nimrod Cohen wrote:
> Hi
>
> I done some performance test, and I wanted to know if any one saw the same
> behavior.
>
>
>
> We need to
to do customization, entity extraction, boiler-plate removal, etc. in
app-friendly code, before transport to the Solr server.
The extraction request handler is a really cool feature and quite
sufficient for a lot of scenarios, but additional architectural flexibility
would be a big win.
-- Jack
admittedly, it's moot if
stats is eventually to be superseded by the analytics component.
-- Jack Krupansky
On Wed, Jan 14, 2015 at 12:26 PM, Chris Hostetter
wrote:
>
> : Does anybody know for sure whether the stats component fully supports
> : distributed mode? It is listed in
ow the new analytics component doesn't support distributed mode, but my
question is about the old "stats" component.
-- Jack Krupansky
It's what Java has, whatever that is:
http://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html
So, maybe the correct answer is neither, but similar to both.
-- Jack Krupansky
On Wed, Jan 14, 2015 at 9:06 AM, tomas.kalas wrote:
> Oh yeah, that is it. Thank you very much
I was suspecting it might do that - the pattern is "greedy" and takes the
longest matching pattern. Add a question mark after the asterisk to use
non-greedy (reluctant) mode, which matches the shortest pattern.
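The greedy vs. non-greedy difference can be seen with any regex engine; this Python sketch mirrors the behavior of the Java Pattern class that Solr's pattern filters use:

```python
import re

text = "<d>first</d> middle <d>second</d>"

# Greedy: .* runs to the last possible </d>, swallowing the middle text.
assert re.findall(r"<d>.*</d>", text) == ["<d>first</d> middle <d>second</d>"]

# Non-greedy: .*? stops at the first </d> it can.
assert re.findall(r"<d>.*?</d>", text) == ["<d>first</d>", "<d>second</d>"]
```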
-- Jack Krupansky
On Wed, Jan 14, 2015 at 8:37 AM, tomas.kalas wrote:
> I just used Sol
It should replace all occurrences of the pattern. Post your specific filter
XML. Patterns can be very tricky.
Use the Solr Admin UI analysis page to see how the filtering is occurring.
-- Jack Krupansky
On Wed, Jan 14, 2015 at 7:16 AM, tomas.kalas wrote:
> Jack, thanks for help, but if i u
umber of unique row sets.
-- Jack Krupansky
On Tue, Jan 13, 2015 at 4:29 PM, tedsolr wrote:
> I have a complicated problem to solve, and I don't know enough about
> lucene/solr to phrase the question properly. This is kind of a shot in the
> dark. My requirement is to return searc
s only . You can use a second pattern char filter to remove
the "<[/]d[12]>" markers as well, probably changing them to a space in both
cases.
See:
http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/pattern/PatternReplaceCharFilterFactory.html
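A char filter along these lines could do it (a sketch; the pattern assumes markers of the form <d1>, </d1>, <d2>, </d2>, and note that the angle brackets must be XML-escaped inside the attribute):

```xml
<fieldType name="text_markers" class="solr.TextField">
  <analyzer>
    <!-- Replace the marker tags with a space before tokenization. -->
    <charFilter class="solr.PatternReplaceCharFilterFactory"
                pattern="&lt;/?d[12]&gt;" replacement=" "/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
  </analyzer>
</fieldType>
```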
-- Jack K
ipt update processors, see my Solr e-book:
http://www.lulu.com/us/en/shop/jack-krupansky/solr-4x-deep-dive-early-access-release-7/ebook/product-21203548.html
-- Jack Krupansky
On Tue, Jan 13, 2015 at 9:21 AM, tomas.kalas wrote:
> Thanks Jack for your advice. Can you please explain me little
A function query or an update processor to create a separate field are
still your best options.
-- Jack Krupansky
On Tue, Jan 13, 2015 at 4:18 AM, Ali Nazemian wrote:
> Dear Markus,
>
> Unfortunately I can not use payload since I want to retrieve this score to
> each user as a
That's your job. The easiest way is to do a copyField to a "string" field.
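In the schema that would look something like this (a sketch; "title" stands in for whatever field is being grouped on):

```xml
<field name="title"   type="text_general" indexed="true" stored="true"/>
<field name="title_s" type="string"       indexed="true" stored="true"/>
<!-- Copy the raw value into the untokenized string field for grouping. -->
<copyField source="title" dest="title_s"/>
```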
-- Jack Krupansky
On Tue, Jan 13, 2015 at 7:33 AM, Naresh Yadav wrote:
> *Schema :*
>
>
> *Code :*
> SolrQuery q = new SolrQuery().setQuery("*:*");
> q.set(GroupParams.GR
Could you clarify what you mean by "Lucene reverse index"? That's not a
term I am familiar with.
-- Jack Krupansky
On Mon, Jan 12, 2015 at 1:01 AM, Ali Nazemian wrote:
> Dear Jack,
> Thank you very much.
> Yeah I was thinking of function query for sorting, but I have to
Won't function queries do the job at query time? You can add or multiply
the tf*idf score by a function of the term frequency of arbitrary terms,
using the tf, mul, and add functions.
See:
https://cwiki.apache.org/confluence/display/solr/Function+Queries
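For example (a sketch; the "text" field and the terms are placeholders, and tf() needs a field indexed with term frequencies):

```
/select?q=text:solr
        &fl=id,score,boosted:mul(tf(text,'solr'),2)
        &sort=add(tf(text,'solr'),tf(text,'lucene')) desc
```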
-- Jack Krupansky
On Sun, Jan 11,
detect some common use cases and handle them
specially in your client. Such as the example you gave - you could extract
the terms and generate separate bq parameters.
-- Jack Krupansky
On Sun, Jan 11, 2015 at 1:28 PM, Michael Lackhoff
wrote:
> Am 11.01.2015 um 18:30 schrieb Jack Krupan
client or app layer code, then maybe
you just need to put more intelligence into that query-generation code in
the client.
-- Jack Krupansky
On Sun, Jan 11, 2015 at 12:08 PM, Michael Lackhoff
wrote:
> Hi Ahmet,
>
> > You might find this useful :
> > https://lucidworks.com/blog/
than this optimize operation?
-- Jack Krupansky
On Sun, Jan 11, 2015 at 1:46 AM, ig01 wrote:
> Thank you all for your response,
> The thing is that we have 180G index while half of it are deleted
> documents.
> We tried to run an optimization in order to shrink index size but it
ities/TFIDFSimilarity.html
And to use your custom similarity class in Solr:
https://cwiki.apache.org/confluence/display/solr/Other+Schema+Elements#OtherSchemaElements-Similarity
-- Jack Krupansky
On Sun, Jan 11, 2015 at 9:04 AM, Ali Nazemian wrote:
> Hi everybody,
>
> I am going to add some analy
"required".) So, please
explain in plain English what effect you are trying to achieve. mm is not
for newbies!
Also, please point us to whatever doc or other material you were reading
that gave you the impression that mm was appropriate for your use case, so
that we can correct any bad documen
the
server rather than optimize performance.
-- Jack Krupansky
On Sat, Jan 10, 2015 at 6:02 AM, SolrUser1543 wrote:
> Would it be a good solution to index single document instead of bulk ?
> In this case I will know about the status of each message .
>
> What is recommendation
Correct, Solr clearly needs improvement in this area. Feel free to comment
on the Jira about what options you would like to see supported.
-- Jack Krupansky
On Sat, Jan 10, 2015 at 5:49 AM, SolrUser1543 wrote:
> From reading this (https://issues.apache.org/jira/browse/SOLR-445) I see
>
"expert" feature. And there should be doc
on how to use it.
I do have some doc in my e-book, with some examples, but even that does not
show the complete end-to-end config and schema.
-- Jack Krupansky
On Sat, Jan 10, 2015 at 1:13 AM, Alexandre Rafalovitch
wrote:
> So, Query Parser does
that the field type
uses the reversed wildcard filter, and then it generates a wildcard query
using the reversed query token and wildcard pattern so that the
leading wildcard becomes a trailing wildcard or prefix query.
-- Jack Krupansky
On Fri, Jan 9, 2015 at 3:15 PM, Alexandre Rafalovitch
Consider an update processor - it can take any input, break it up any way
you want, and then output multiple field values.
You can even use the stateless script update processor to write the logic in
JavaScript.
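The chain registration in solrconfig.xml would look roughly like this (a sketch; the chain and script names are placeholders):

```xml
<updateRequestProcessorChain name="split-field">
  <processor class="solr.StatelessScriptUpdateProcessorFactory">
    <str name="script">split-field.js</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```

The referenced split-field.js (in the core's conf directory) would define a processAdd(cmd) function that reads the input field from cmd.solrDoc, splits it however you like, and adds the pieces back as separate field values.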
-- Jack Krupansky
On Fri, Jan 9, 2015 at 6:47 AM, tomas.kalas wrote:
> Hello
acceptable
performance for both indexing and a full range of queries, and then use 10x
that RAM for the RAM for the 100% load. That's the OS system memory for
file caching, not the total system RAM.
-- Jack Krupansky
On Thu, Jan 8, 2015 at 4:55 PM, Nishanth S wrote:
> Thanks guys for your inpu
mean there will be
a reduction in the amount of system memory needed for file caching of the
Lucene index. 100 / 4 * 2.8GB = 70 GB of RAM needed on each server.
-- Jack Krupansky
On Thu, Jan 8, 2015 at 10:57 AM, Andrew Butkus <
andrew.but...@c6-intelligence.com> wrote:
> Hi Shawn,
>
number of CPU cores?
-- Jack Krupansky
On Wed, Jan 7, 2015 at 9:14 PM, Nishanth S wrote:
> Thanks Shawn and Walter.Yes those are 12,000 writes/second.Reads for the
> moment would be in the 1000 reads/second. Guess finding out the right
> number of shards would be my starting point.
&
cores/tenants.
Will tenants be directly accessing Solr, or will you provide them with a
REST API for an application layer that intermediates access to Solr?
-- Jack Krupansky
On Wed, Jan 7, 2015 at 4:31 AM, Bram Van Dam wrote:
> One possibility is to have separate core for each tenant domain.
&
queries are expressed and the results being returned.
-- Jack Krupansky
On Tue, Jan 6, 2015 at 3:39 AM, klunwebale wrote:
> hello
>
> i want to create a vertical search engine like trovit.com.
>
> I have installed solr and solarium.
>
> What else to i need can you recomme
You need to escape the space in your query (using backslash or quotes
around the term) - the query parser doesn't parse based on the
analyzer/tokenizer for each field.
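Client-side, the escaping can be as simple as this (a Python sketch; the set of escaped characters below is an assumption covering whitespace and the common Lucene/Solr special characters):

```python
def escape_solr_term(term: str) -> str:
    """Backslash-escape whitespace and common query-parser special characters."""
    special = set(' +-&|!(){}[]^"~*?:\\/')
    return "".join("\\" + ch if ch in special else ch for ch in term)

assert escape_solr_term("Test Street") == "Test\\ Street"
# Or simply quote the whole value in the query: street:"Test Street"
```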
-- Jack Krupansky
On Tue, Jan 6, 2015 at 4:05 AM, Sankalp Gupta
wrote:
> Hi
> I come across this weird behaviour i
But I agree that it would be highly desirable to push that 100 million
number up to 350 million or even 500 million ASAP, since the pain of
unnecessary sharding is excessive.
I wonder what changes will have to occur in Lucene, or... what evolution in
commodity hardware will be necessary t
ere.
So the race is on between when Lucene will relax the 2G limit and when
hardware gets fast enough that 2G documents can be indexed within a small
number of hours.
-- Jack Krupansky
On Sat, Jan 3, 2015 at 4:00 PM, Toke Eskildsen
wrote:
> Erick Erickson [erickerick...@gmail.com] wrote:
&
First, see if you can get your requirements to align to the de-dupe feature
that Solr already has:
https://cwiki.apache.org/confluence/display/solr/De-Duplication
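The standard de-dupe setup is an update chain like this (a sketch; the signature fields are placeholders, and for near-duplicate "distance"-style matching TextProfileSignature may fit better than the exact Lookup3Signature):

```xml
<updateRequestProcessorChain name="dedupe">
  <processor class="solr.processor.SignatureUpdateProcessorFactory">
    <bool name="enabled">true</bool>
    <str name="signatureField">signature</str>
    <bool name="overwriteDupes">true</bool>
    <str name="fields">name,address</str>
    <str name="signatureClass">solr.processor.Lookup3Signature</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```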
-- Jack Krupansky
On Sat, Jan 3, 2015 at 2:54 AM, Amit Jha wrote:
> I am trying to find out duplicate records based on dista
R-839
-- Jack Krupansky
On Thu, Jan 1, 2015 at 4:08 AM, Leonid Bolshinsky
wrote:
> Hello,
>
> Are we always limited by the query parser syntax when passing a query
> string to Solr?
> What about the query elements which are not supported by the syntax?
> For example, BooleanQuery.setM
You would have to do your own build since the patch has not been committed.
-- Jack Krupansky
On Wed, Dec 31, 2014 at 12:27 AM, Rajesh wrote:
> Mikhail,
>
> How can I get a nightly build with fix for SOLR-5147 included. I've
> searched and found that nightly build will not be
I do have a more thorough discussion of WDF in my Solr Deep Dive e-book:
http://www.lulu.com/us/en/shop/jack-krupansky/solr-4x-deep-dive-early-access-release-7/ebook/product-21203548.html
You're not "wrong" about anything here... you just need to accept that WDF
is not magic a
Right, that's what I meant by WDF not being "magic" - you can configure it
to match any three out of four use cases as you choose, but there is no
choice that matches all of the use cases.
To be clear, this is not a "bug" in WDF, but simply a limitation.
-- Jack Krupan
a proof of
concept implementation to validate whether the sweet spot for your
particular data, data model, and application access patterns may be well
above or even below that.
Yes, indeed, sing praises for heroes, but don't kill yourself and drag down
others trying to be one yourself.
--
e absolute precision. Sometimes
you just want to know whether "something" exists matching the pattern, or
"generally" what the values look like.
I think it would be worth a Jira.
-- Jack Krupansky
On Tue, Dec 30, 2014 at 6:16 AM, Modassar Ather
wrote:
> Hi,
>
>
term and the multi-term phrase, while the query analyzer would NOT
do the split on case, so that the query could be a unitary term (possibly
with mixed case, but that would not split the term) or could be a two-word
phrase.
-- Jack Krupansky
On Mon, Dec 29, 2014 at 5:12 PM
.
-- Jack Krupansky
On Mon, Dec 29, 2014 at 12:54 PM, Erick Erickson
wrote:
> When you say 2B docs on a single Solr instance, are you talking only one
> shard?
> Because if you are, you're very close to the absolute upper limit of a
> shard, internally
> the doc
You can also use group.query or group.func to group documents matching a
query or unique values of a function query. For the latter you could
implement an NLP algorithm.
-- Jack Krupansky
On Sun, Dec 28, 2014 at 5:56 PM, Meraj A. Khan wrote:
> Thanks Aman, the thing is the bookName fi
are no longer I/O bound. If compute bound,
shard more heavily until the query latency becomes acceptable.
-- Jack Krupansky
On Fri, Dec 26, 2014 at 1:02 AM, Modassar Ather
wrote:
> Thanks for your suggestions Erick.
>
> This may be one of those situations where you really have to
&g
/solr/Exporting+Result+Sets
-- Jack Krupansky
On Fri, Dec 26, 2014 at 3:58 AM, Sandy Ding wrote:
> Hi, all
>
> I've recently set up a solr cluster and found that "export" returns
> different results from "select".
> And I confirmed that the "expor
Whether it is Tomcat or Solr that gives the error, the main point is that
the raw circumflex shouldn't be sent to either.
-- Jack Krupansky
On Wed, Dec 24, 2014 at 4:32 PM, Erick Erickson
wrote:
> OK, then I don't think it's a Solr problem. I think 5 of your Tomcats are
> con
thing, but the real problem
is further upstream and hasn't been fully expressed. My model is to give you
a lot of examples and you can decide for yourself which best exemplifies
what you are trying to do. And to give more detail on the features of Solr.
-- Jack Krupansky
-Origina
My Solr Deep Dive e-book has full details and lots of examples for CSV
indexing:
http://www.lulu.com/us/en/shop/jack-krupansky/solr-4x-deep-dive-early-access-release-7/ebook/product-21203548.html
-- Jack Krupansky
-Original Message-
From: Alexandre Rafalovitch
Sent: Tuesday, December
boost as do less-precise phrases.
But it does need to be optional since it has an added cost at query time.
-- Jack Krupansky
-Original Message-
From: Michael Sokolov
Sent: Saturday, December 13, 2014 8:43 AM
To: solr-user@lucene.apache.org
Subject: Re: different fields for user-supplied
If possible, please post your field type for others to see the final
solution. Thanks!
-- Jack Krupansky
-Original Message-
From: Dinesh Babu
Sent: Wednesday, December 10, 2014 9:54 AM
To: solr-user@lucene.apache.org ; Ahmet Arslan
Subject: RE: How to stop Solr tokenising search
combined with the NGramFilterFactory and lower
case filter, but only use the ngram filter at index time.
See:
http://lucene.apache.org/core/4_10_2/analyzers-common/org/apache/lucene/analysis/ngram/NGramFilterFactory.html
But be aware that use of the ngram filter dramatically increases the index
to providing us with more specific requirements. My guess, from your
mention of LDAP, is that the field would contain only a name, but... that's
me guessing when you need to be specific. Once this distinction is cleared
up, we can then focus on solutions that work either for arbitrary text or
In particular, if they are image-intensive, all the images go away. And the
formatting as well.
-- Jack Krupansky
-Original Message-
From: Ahmet Arslan
Sent: Monday, December 1, 2014 6:02 PM
To: solr-user@lucene.apache.org
Subject: Re: Large fields storage
Hi Avi,
I assume your
of adopting for Solr.
I mean, are we trying to reinvent the wheel here, or what?!
Note: This is the Solr USER list, which isn't the best forum for development
discussions.
-- Jack Krupansky
-Original Message-
From: Erik Hatcher
Sent: Sunday, November 30, 2014 10