can
execute them or if they require fanout to other shards and then aggregation
of results from those other shards.
-- Jack Krupansky
On Mon, Feb 8, 2016 at 11:24 AM, Erick Erickson
wrote:
> Short form: You really have to prototype. Here's the long form:
>
>
> https://lucidwo
And you're sure that you can't use the terms query parser, which was
explicitly designed for handling a very long list of terms to be implicitly
ORed?
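For reference, a terms query parser request looks roughly like this (field name and values are illustrative):

```
q={!terms f=id}12,42,77,103
```

By default the values are comma-separated and implicitly ORed, with no per-term scoring overhead and no Boolean max-clause limit.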
-- Jack Krupansky
On Sat, Feb 6, 2016 at 2:26 PM, Salman Ansari
wrote:
> It looked like there was another issue with my query. I
://docs.mongodb.org/manual/reference/program/mongofiles/
-- Jack Krupansky
On Fri, Feb 5, 2016 at 3:13 PM, Arnett, Gabriel
wrote:
> Anyone have any experience indexing pdfs stored in binary form in mongodb?
>
> .
> Gabe Arnett
> Senior Dir
", definitely not "quite long."
That said, the starting point for any data modeling effort is to look at
the full range of desired queries and that should drive the data model. So,
give us more info on queries, in terms of plain English descriptions of
what the user is trying to achieve.
.
Besides, the general goal is to avoid app clients talking directly to Solr
anyway.
-- Jack Krupansky
On Thu, Feb 4, 2016 at 2:57 AM, Derek Poh wrote:
> Hi Erick
>
> <<
> The manual way of doing this would be to construct an elaborate query,
> like q=spp_keyword_e
Yeah, that's exactly the kind of innocent user error that UIMA simply has
no code to detect and reasonably report.
-- Jack Krupansky
On Mon, Feb 1, 2016 at 12:13 PM, Gian Maria Ricci - aka Alkampfer <
alkamp...@nablasoft.com> wrote:
> It was a stupid error, I've mi
does
not exist.
-- Jack Krupansky
On Mon, Feb 1, 2016 at 10:18 AM, alkampfer wrote:
>
>
> From: outlook_288fbf38c031d...@outlook.com
> To: solr-user@lucene.apache.org
> Cc:
> Date: Mon, 1 Feb 2016 15:59:02 +0100
> Subject: Error configuring UIMA
>
> I've solv
Some people prefer to use Stack Overflow, but this mailing list is still
the definitive "forum" for Solr users.
See:
http://stackoverflow.com/questions/tagged/solr
-- Jack Krupansky
On Mon, Feb 1, 2016 at 10:58 AM, Shawn Heisey wrote:
> On 2/1/2016 1:13 AM, Jean-Jacques MONOT wr
At the bottom (the fine print!) it says: lineNumber: 15; columnNumber: 7;
The element type "meta" must be terminated by the matching end-tag
"".
-- Jack Krupansky
On Mon, Feb 1, 2016 at 10:45 AM, Gian Maria Ricci - aka Alkampfer <
alkamp...@nablasoft.com> wrote:
>
://cwiki.apache.org/confluence/display/solr/IndexConfig+in+SolrConfig
-- Jack Krupansky
On Sun, Jan 31, 2016 at 1:59 PM, abhi Abhishek wrote:
> Hi All,
> any suggestions/ ideas?
>
> Thanks,
> Abhishek
>
> On Tue, Jan 26, 2016 at 9:16 PM, abhi Abhishek
> wrote:
>
> >
Or try the terms query parser that lets you eliminate all the OR operators:
https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-TermsQueryParser
-- Jack Krupansky
On Sun, Jan 31, 2016 at 9:23 AM, Paul Libbrecht wrote:
> How about using POST?
>
> paul
>
of 5GB. If you want to get a lot
above that, you're in uncharted territory. Besides, if you start pushing
your index well above the amount of available system memory your query
performance will suffer. I'd watch for the latter before pushing on the
former.
-- Jack Krupansky
On Sun, Jan
d not be possible with a limit
of only 15GB.
Maybe you could clue us in as to what effect you are trying to achieve. I
mean, why should any app care whether segments are 10GB or 15GB?
-- Jack Krupansky
On Sat, Jan 30, 2016 at 6:28 PM, Shawn Heisey wrote:
> On 1/30/2016 7:31 AM, Zheng Lin Edwin
have room to expand and handle spikes.
8. Run that final config for an extended period (days) with as realistic a
load as possible
9. If it too hits OOM or frequent GC, you may have to bump up the heap some
more, like another 10%.
-- Jack Krupansky
On Fri, Jan 29, 2016 at 11:51 AM, Erick Eri
block must be written to a new segment.
-- Jack Krupansky
On Fri, Jan 29, 2016 at 5:13 AM, Sathyakumar Seshachalam <
sathyakumar_seshacha...@trimble.com> wrote:
> Hi,
>
> Am trying to investigate the possibility of using Block Join query parser
> in a many-to-many
A simple boost query (bq) might do the trick, using edismax:
q=dvd bracket
bq=spp_keyword_exact:"dvd bracket"^100
qf=P_VeryShortDescription P_ShortDescription P_CatConcatKeyword
-- Jack Krupansky
On Thu, Jan 28, 2016 at 12:49 PM, Erick Erickson
wrote:
> bq: if you are interested
sing curl, please post the full curl command.
-- Jack Krupansky
On Thu, Jan 28, 2016 at 1:03 AM, diyun2008 wrote:
> The query is rather simple:
> http://127.0.0.1:8080/solr/collection1/select?q=title:#7654321*
>
>
>
>
> --
> View this message in context:
> http://l
would never be a
need to "re" score them. Are you simply looking for a way to shift/boost
the scores somehow? Again, tell us more about what you are actually trying
to achieve.
-- Jack Krupansky
On Thu, Jan 28, 2016 at 9:52 AM, vitaly bulgakov
wrote:
> I have Solr 4.2. Is it p
Just to be sure, please post the lines of code or command line that you are
using to issue the query.
-- Jack Krupansky
On Wed, Jan 27, 2016 at 10:50 PM, Yonik Seeley wrote:
> On Wed, Jan 27, 2016 at 10:47 PM, diyun2008 wrote:
> > Hi Yonik
> >
> >I do actually en
doc, which for Tiered is here:
http://lucene.apache.org/core/5_4_0/core/org/apache/lucene/index/TieredMergePolicy.html
I did doc all of these options (as of Solr 4.4) in my Solr 4.x Deep Dive
e-book and I don't think much of that has changed since then:
http://www.lulu.com/us/en/shop/jack-krupans
What exactly are your merge policy settings in solrconfig? They control
when the background merges will be performed. Sometimes they do need to be
tweaked.
-- Jack Krupansky
On Mon, Jan 25, 2016 at 1:50 PM, James Mason
wrote:
> Hi,
>
> I’ve have a large index that has been adde
Just escape them with a backslash. Or put each term in quotes.
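For example (field and term illustrative), either backslash-escape each special character, or quote the whole term so wildcards are not interpreted:

```
q=title:5\* OR title:"5*"
```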
-- Jack Krupansky
On Sun, Jan 24, 2016 at 5:21 AM, Jian Mou wrote:
> Hi Jack,
>
> Thanks! Do you know how to disable wildcards? What I want is: if the input
> contains wildcards, just treat them as normal chars. In other words,
>
ll as HA availability requirements.
-- Jack Krupansky
On Fri, Jan 22, 2016 at 5:45 PM, Toke Eskildsen
wrote:
> Aswath Srinivasan (TMS) wrote:
> > * Totally about 2.5 million documents to be indexed
> > * Documents average size is 512 KB - pdfs and htmls
>
> &g
To be clear, having separate Solr servers on different versions should
definitely not be a problem. The only potential difficulty here is the
SolrJ vs. server back-compat issue.
-- Jack Krupansky
On Fri, Jan 22, 2016 at 10:57 AM,
wrote:
> Shawn wrote:
> >
> > If you are NOT ru
nts aren't using
any new features, there would be a reasonable expectation that they should
continue to work.
-- Jack Krupansky
On Fri, Jan 22, 2016 at 10:40 AM,
wrote:
> Yeah, sort of. Solr isn't bundled in the CMS, it is in a separate Tomcat
> instance. But our code is running
), the app
should work fine. So... if you stick with SolrJ 4 and use the Solr 4 doc as
your guide, you should be okay. That's the theory.
Worst case, you would have to deploy a Solr 4 server. That's not the
preferred choice, but is a decent backup plan.
-- Jack Krupansky
On Fri, Jan 22, 201
Just to be clear, are you talking about a single app that does SolrJ calls
to both your CMS and your free text search index? So, one Java app that is
simultaneously sending requests to two Solr instances (one 4, one 5)?
-- Jack Krupansky
On Fri, Jan 22, 2016 at 1:57 AM,
wrote:
> Hi,
>
complex wildcard is used - should an
exception be thrown, or... what?
I suppose it might be simplest to have a Solr option to limit the number of
wildcard characters used in a term, like to 4 or 8 or something like that.
IOW, have Solr check the term before the WildcardQuery is generated.
-- Jack
issue for Solr. The only issue there is
assuring that you have enough Solr shards and replicas to handle the
aggregate request load.
-- Jack Krupansky
On Thu, Jan 21, 2016 at 6:37 AM, Gian Maria Ricci - aka Alkampfer <
alkamp...@nablasoft.com> wrote:
> Hi,
>
>
>
> I’ve
te the doc for this stored field
restriction, right?!)
-- Jack Krupansky
On Wed, Jan 20, 2016 at 9:38 AM, Joel Bernstein wrote:
> CloudSolrStream is available in Solr 5. The "search" streaming expression
> can be used, or CloudSolrStream can be used directly.
>
> https://cwi
ients
that automatically send requests to all the shards in a collection (or
multiple collections) and then merge the sorted sets any way they wish."
-- Jack Krupansky
On Wed, Jan 20, 2016 at 8:41 AM, Susheel Kumar
wrote:
> Hello Salman,
>
> Please checkout the export fu
ogether.*" They must also be updated together.
-- Jack Krupansky
On Fri, Jan 15, 2016 at 3:31 AM, Mikhail Khludnev <
mkhlud...@griddynamics.com> wrote:
> On Thu, Jan 14, 2016 at 10:01 PM, sairamkumar <
> sairam.subraman...@gmail.com>
> wrote:
>
> > This is a
.
Plenty of doc for you to start reading. Once you get the basics, then you
can move on to more specific and advanced details:
https://cwiki.apache.org/confluence/display/solr/Understanding+Analyzers%2C+Tokenizers%2C+and+Filters
-- Jack Krupansky
On Fri, Jan 15, 2016 at 2:58 PM, sara hajili
the entire index. If you actually don't need minimal latency, then of
course you can feel free to trade off RAM for lower latency.
-- Jack Krupansky
On Fri, Jan 15, 2016 at 4:43 AM, Gian Maria Ricci - aka Alkampfer <
alkamp...@nablasoft.com> wrote:
> Hi,
>
>
>
> When it
shard, let alone all shards. Should
backups be collection-based as well?
-- Jack Krupansky
On Fri, Jan 15, 2016 at 3:26 AM, Gian Maria Ricci - aka Alkampfer <
alkamp...@nablasoft.com> wrote:
> Yes, I've checked that jira some weeks ago and it is the reason why I was
> telling
ng parsing of the query) to send the request to exactly the node
(or replica) that owns that token/ID.
But if you're really just trying to "query by ID", that should really have a
nice clean API so you don't have to build query syntax.
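Solr's real-time get handler is one such clean API; a lookup by unique key is just (URL, collection, and id illustrative):

```
http://localhost:8983/solr/collection1/get?id=doc42
```

It routes to the owning shard for you and needs no query syntax at all.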
-- Jack Krupansky
On Thu, Jan 14, 2016 at 8:41 P
although even that should not be a big problem.
And make sure the ID field is string or numeric, not tokenized text.
-- Jack Krupansky
On Thu, Jan 14, 2016 at 7:53 PM, Shawn Heisey wrote:
> On 1/14/2016 5:20 PM, Shivaji Dutta wrote:
> > I am working with a customer that has abou
Which release of Solr are you using? Last year (or so) there was a Lucene
change that had the effect of keeping all terms for WDF at the same
position. There was also some discussion about whether this was either a
bug or a bug fix, but I don't recall any resolution.
-- Jack Krupansky
O
That sounds like it. Sorry my memory is so hazy.
Maybe Yonik can either confirm that that Jira is still outstanding or close
it, and confirm if these symptoms are related.
-- Jack Krupansky
On Thu, Jan 14, 2016 at 10:54 AM, Erick Erickson
wrote:
> Jack:
>
> I think that was for facet
t" indicates success or
"Exception while creating snapshot" indicates failure. If only that first
message appears, it means the backup is still in progress.
-- Jack Krupansky
On Thu, Jan 14, 2016 at 9:23 AM, Gian Maria Ricci - aka Alkampfer <
alkamp...@nablasoft.com> wro
I recall a couple of previous discussions regarding some sort of
filter/field cache change in Lucene where they removed what had been an
optimization for Solr.
-- Jack Krupansky
On Wed, Jan 13, 2016 at 8:10 PM, Erick Erickson
wrote:
> It's quite surprising that you're getting
e considered a fresh
new distributed Solr deployment with anything other than SolrCloud.
(Hmmm... have any of the committers considered deprecating the old
non-SolrCloud distributed mode features?)
-- Jack Krupansky
On Wed, Jan 13, 2016 at 9:02 AM, Shivaji Dutta
wrote:
> - SolrCloud uses
and invest
significant effort in a custom request handler when simpler techniques may
suffice.
-- Jack Krupansky
On Sat, Jan 9, 2016 at 12:08 PM, Ahmet Arslan
wrote:
> Hi Mark,
>
> Yes this is possible. Better, you can use a custom SearchComponent for
> this task too.
> You retri
ption.*"
So that's a second reason - to avoid the max clause count limitation of
Boolean Query.
See:
https://lucene.apache.org/core/5_4_0/core/org/apache/lucene/search/MultiTermQuery.html#CONSTANT_SCORE_REWRITE
https://lucene.apache.org/core/5_4_0/core/org/apache/lucene/search/WildcardQuery
ork, be sure to provide detail of what the symptom
is rather than simply saying that it doesn't work.
-- Jack Krupansky
On Wed, Jan 6, 2016 at 8:43 AM, marotosg wrote:
> Hi,
>
> I am trying to add a new field to my schema to add the number of items of a
> multivalued field.
&g
://www.elastic.co/guide/en/elasticsearch/reference/current/search-percolate.html
-- Jack Krupansky
On Tue, Jan 5, 2016 at 11:05 AM, Allison, Timothy B.
wrote:
> Might want to look into:
>
> https://github.com/flaxsearch/luwak
>
> or
> https://github.com/OpenSextant/Solr
ctory should contain a solr.xml file,
unless solr.xml exists in ZooKeeper. The default value is server/solr.
"
https://cwiki.apache.org/confluence/display/solr/Solr+Start+Script+Reference
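That quote describes the -s option of the start script; for example (path illustrative):

```
bin/solr start -s /var/solr/data
```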
-- Jack Krupansky
On Mon, Jan 4, 2016 at 10:28 AM, Mugeesh Husain wrote:
> you could start solr
need function queries there as well.
-- Jack Krupansky
On Thu, Dec 31, 2015 at 6:50 PM, William Bell wrote:
> We are getting weird results with if(exists(a),b,c). We are getting b+c!!
>
>
> http://localhost:8983/solr/providersearch/select?q=*:*&wt=json&state=state:%22CO%22&stat
Is the field multivalued?
-- Jack Krupansky
On Sun, Dec 27, 2015 at 11:16 PM, Jamie Johnson wrote:
> What is the difference of adding a field with the same value twice or
> adding it once and boosting the field on add? Is there a situation where
> one approach is preferred?
>
> Jamie
>
abase. Was someone telling you something different?
-- Jack Krupansky
On Mon, Dec 28, 2015 at 1:48 PM, Salman Ansari
wrote:
> Hi,
>
> I am facing an issue where I need to change Solr schema but I have crucial
> data that I don't want to delete. Is there a way where I can chan
itself (other than raw JMX and ping.)
-- Jack Krupansky
On Wed, Dec 23, 2015 at 6:27 AM, Emir Arnautovic <
emir.arnauto...@sematext.com> wrote:
> Hi Shail,
> As William mentioned, our SPM <https://sematext.com/spm/index.html>
> allows you to monitor all main Solr/Jvm/Host me
,
making more copies of the index that can each be searched in parallel.
How long do queries take when the site is operating normally?
Make sure that you have enough system memory to cache the index, otherwise
the machine will be thrashing with lots of I/O for competing requests.
-- Jack Krupansky
On
formance is consumed when you have a lot of fields which are
not present for a particular data source.
-- Jack Krupansky
On Tue, Dec 22, 2015 at 11:25 AM, Susheel Kumar
wrote:
> Hello,
>
> I am going thru few use cases where we have kind of multiple disparate data
> sources which in
the exact practical limit
depends on your particular hardware and your particular data model and the
data itself.
How large is each document, roughly? Hundreds, thousands, or millions of
bytes? Are some documents extremely large?
-- Jack Krupansky
On Fri, Dec 18, 2015 at 10:30 AM, Toke Eskild
or to return a large bulk of documents?
-- Jack Krupansky
On Thu, Dec 17, 2015 at 7:01 AM, Modassar Ather
wrote:
> Hi,
>
> I have a field f which is defined as follows.
> omitNorms="true"/>
>
> Solr-5.2.1 is used. The index is spread across 12 shards (no replic
update
has various caveats so that it is only useful in a subset of use cases.
-- Jack Krupansky
On Wed, Dec 16, 2015 at 10:09 AM, Jamie Johnson wrote:
> I have a use case where we only need to append some fields to a document.
> To retrieve the full representation is very expensive but I can
There is no HA with a single replica for each shard. Replication factor
must be at least 2 for HA.
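As a sketch (collection name and counts illustrative), an HA-capable collection is created with replicationFactor of at least 2 via the Collections API:

```
http://localhost:8983/solr/admin/collections?action=CREATE&name=mycoll&numShards=2&replicationFactor=2
```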
-- Jack Krupansky
On Wed, Dec 16, 2015 at 12:38 AM, Peter Tan wrote:
> Hi Jack, What happens when there is only one replica setup?
>
> On Tue, Dec 15, 2015 at 9:32 PM, Jack Krupansky
Solr Cloud provides HA when you configure at least two replicas for each
shard and have at least 3 zookeepers. That's it. No deck or detail document
is needed.
-- Jack Krupansky
On Tue, Dec 15, 2015 at 9:07 PM, wrote:
> Hi Team,
>
> Can you help me in understanding in achieving
ink of
the company as being named "Apple Computer" even though they dropped
"Computer" from the name back in 2007. Also, it is "Inc.", not "Company",
so a proper search would be for "Apple Inc." or the old "Apple Computer,
Inc."
-- Jack Kr
same things as well.
-- Jack Krupansky
On Tue, Dec 15, 2015 at 2:42 PM, Chris Hostetter
wrote:
>
> : Sweetspot does require reindexing but is that the only one? I have not
> : investigated some exotic implementations, anyone to confirm sweetspot is
> : the only one? In that case you
You would need to define an alternate field which copied a base field but
then had the desired alternate similarity, using SchemaSimilarityFactory.
See:
https://cwiki.apache.org/confluence/display/solr/Other+Schema+Elements
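A rough sketch of that setup (all names illustrative; SweetSpot is shown only as one example similarity, and its tuning parameters are omitted; check the ref guide):

```xml
<similarity class="solr.SchemaSimilarityFactory"/>

<fieldType name="text_sweet" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
  </analyzer>
  <similarity class="solr.SweetSpotSimilarityFactory"/>
</fieldType>

<field name="body_sweet" type="text_sweet" indexed="true" stored="false"/>
<copyField source="body" dest="body_sweet"/>
```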
-- Jack Krupansky
On Tue, Dec 15, 2015 at 10:02 AM, Dmitry Kan wrote
and then index the raw text.
-- Jack Krupansky
On Mon, Dec 14, 2015 at 12:04 PM, Antelmo Aguilar wrote:
> Hello,
>
> I am trying to index a very large file in Solr (around 5GB). However, I
> get out of memory errors using Curl. I tried using the post script and I
> had some
in a separate table (use the same partition key to assure
that the join will be more efficient by being on the same node.)
-- Jack Krupansky
On Fri, Dec 11, 2015 at 6:21 AM, Andrea Gazzarini
wrote:
> Hi Vikram,
> sounds like you're using those "dynamic" fields only for visua
You can also use Solr Cell to send entire PDF or office documents:
https://cwiki.apache.org/confluence/display/solr/Uploading+Data+with+Solr+Cell+using+Apache+Tika
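A minimal Solr Cell upload might look like this (URL, collection, id, and filename illustrative):

```
curl "http://localhost:8983/solr/mycollection/update/extract?literal.id=doc1&commit=true" \
  -F "myfile=@document.pdf"
```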
-- Jack Krupansky
On Wed, Dec 9, 2015 at 3:09 AM, subinalex wrote:
> Hi,
>
> I am a solr newbie,just got a quick
monly.
And, yes, each app has its own latency requirements. The purpose of a
general rule is to generally avoid unhappiness, but if you have an appetite
and tolerance for unhappiness, then go for it.
Replica vs. shard? They're basically the same - a replica is a copy of a
shard.
-- Jack Kr
constantly
re-read portions of the index into memory.
The practical limit for documents is not per core or number of cores but
across all cores on the node since it is mostly a memory limit and the
available CPU resources for accessing that memory.
-- Jack Krupansky
On Tue, Dec 8, 2015 at 8:57 AM
Never made it into CHANGES.txt either. Not part of any patch either.
Appears to have been secretly committed as a part of SOLR-6787 (Blob API) via
Revision 1650448
<http://svn.apache.org/viewvc?view=revision&revision=1650448> in Solr 5.1.
-- Jack Krupansky
On Fri, Dec 4, 2015 a
recall (even the most remote partial match to avoid missing
any documents) with a much higher boost for exact matches.
-- Jack Krupansky
On Tue, Dec 1, 2015 at 10:10 AM, Erik Hatcher
wrote:
> One technique that works well is to use copyField to end up with two
> indexed fields, on
The mm parameter or default operator logic only applies to the top level of
the query. Once you get nested in parentheses below the top level,
Solr/Lucene reverts to the default of the OR (SHOULD) operator.
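A sketch of that behavior with edismax (terms illustrative):

```
q=(quick brown) (fox jumps)
mm=2
```

mm=2 applies only to the two top-level parenthesized clauses (both groups must match), while inside each group the terms revert to the default OR.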
-- Jack Krupansky
On Mon, Nov 30, 2015 at 5:45 AM, Modassar Ather
wrote:
> Hi,
&g
Yeah, this stuff is poorly documented, not very intuitive, and the
terminology is poorly designed in the first place, so it's completely
expected to easily get confused by it. Not even a mention of it in the Solr
reference guide.
-- Jack Krupansky
On Wed, Nov 25, 2015 at 4:39 AM, Aless
I'm not sure how useful it will be.
-- Jack Krupansky
On Tue, Nov 24, 2015 at 4:06 AM, Manohar Sripada
wrote:
> I have a requirement where I need to be able to query on a field (say
> "salary"). This field contains data in Chinese.
>
> Is it possible in Solr to do a ra
The primary recommendation is that you flatten nested documents.
That means one Solr document per cpc, not multivalued.
As always, queries should drive your data model, so please specify what a
typical query might be like, in plain English.
-- Jack Krupansky
On Tue, Nov 24, 2015 at 4:39 AM
IDs in use during a particular
interval of time?
-- Jack Krupansky
On Fri, Nov 20, 2015 at 4:50 PM, jichi wrote:
> Hi,
>
> I am using Solr 4.7.0 to search text with an id filter, like this:
>
> id:(100 OR 2 OR 5 OR 81 OR 10 ...)
>
> The number of IDs in the boolean fi
Do the failing IDs have any special characters that might need to be
escaped?
Can you find the documents using a normal query on the unique key field?
-- Jack Krupansky
On Thu, Nov 19, 2015 at 10:27 AM, Jérémie MONSINJON <
jeremie.monsin...@gmail.com> wrote:
> Hello everyone !
>
per shard. But be aware
that a query for the sharded version will be slower than for a single-shard
implementation.
-- Jack Krupansky
On Wed, Nov 18, 2015 at 11:02 PM, Troy Edwards
wrote:
> I am looking for some good articles/guidance on how to determine number of
> shards and replicas
Use an index-time (but not query time) synonym filter with a rule like:
Abd Allah,Abdallah
This will index the combined word in addition to the separate words.
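Sketched as Solr 5.x analyzer config (field type and file name illustrative), with name-synonyms.txt containing the rule `Abd Allah,Abdallah`:

```xml
<analyzer type="index">
  <tokenizer class="solr.StandardTokenizerFactory"/>
  <filter class="solr.SynonymFilterFactory" synonyms="name-synonyms.txt"
          ignoreCase="true" expand="true"/>
</analyzer>
<analyzer type="query">
  <tokenizer class="solr.StandardTokenizerFactory"/>
</analyzer>
```

With expand="true" at index time only, both the combined and separate forms are indexed, so either spelling matches at query time.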
-- Jack Krupansky
On Mon, Nov 9, 2015 at 4:48 AM, Mahmoud Almokadem
wrote:
> Hello,
>
> We are indexing Arabic content and
if that is a lot faster, both with old and new Solr.
-- Jack Krupansky
On Fri, Nov 6, 2015 at 3:01 PM, wei wrote:
> Thanks Jack and Shawn. I checked these Jira tickets, but I am not sure if
> the slowness of MatchAllDocsQuery is also caused by the removal of
> fieldcache. Can someo
I vaguely recall some discussion concerning removal of the field cache in
Lucene.
-- Jack Krupansky
On Thu, Nov 5, 2015 at 10:38 PM, wei wrote:
> We are running our search on solr4.7 and I am evaluating whether to upgrade
> to solr5.3.1. I found MatchAllDocsQuery is much slower in sol
Great. Now, we'll have to see if any enterprising committers will step up
and take a look.
-- Jack Krupansky
On Thu, Nov 5, 2015 at 4:46 AM, Mahmoud Almokadem
wrote:
> Thanks Jack. I have reported it as a bug on JIRA
>
> https://issues.apache.org/jira/browse/SOLR-
ittle outdated (since 4.4) and even then
was not complete (no SolrCloud or DIH), but its table of contents would
probably give you a fair view of the sheer magnitude of the number of Solr
features:
http://www.lulu.com/shop/jack-krupansky/solr-4x-deep-dive-early-access-release-7/ebook/product-21203548
I think you should go ahead and file a Jira ticket for this as a bug since
either it is an actual bug or some behavior nuance that needs to be
documented better.
-- Jack Krupansky
On Wed, Nov 4, 2015 at 8:24 AM, Mahmoud Almokadem
wrote:
> I removed the q.op="AND" and added the mm=2
>
top-level parentheses is causing the query parser logic to act
as if the parentheses were not there.
You neglected to give us your qf parameter, but obviously it is:
qf=Title^200.0 TotalField, I think.
-- Jack Krupansky
On Wed, Nov 4, 2015 at 3:39 AM, Mahmoud Almokadem
wrote:
> Hello,
>
Did you index the data before adding the word delimiter filter? The white
space tokenizer preserves the period after "stocks.", but the WDF should
remove it. The period is likely interfering with stemming.
Are your filters the same for index time and query time?
-- Jack Krupansky
On T
Are you trying to do an atomic update without the content field? If so, it
sounds like Solr needs an enhancement (bug fix?) so that language detection
would be skipped if the input field is not present. Or maybe that could be
an option.
-- Jack Krupansky
On Thu, Oct 29, 2015 at 3:25 AM, Chaushu
Each instance should be installed in a separate directory. IOW, don't try
running multiple Solr processes for the same data.
-- Jack Krupansky
On Mon, Oct 26, 2015 at 1:33 PM, Steven White wrote:
> Hi,
>
> For reasons I have no control over, I'm required to run 2 (maybe m
o use
Solr in a way other than it was intended.
-- Jack Krupansky
On Sat, Oct 24, 2015 at 11:13 AM, Aki Balogh wrote:
> Gotcha - that's disheartening.
>
> One idea: when I run termfreq, I get all of the termfreqs for each document
> one-by-one.
>
> Is there a way to h
about your usage?
Generally, moderate use of a feature is much more advisable than heavy usage,
unless you don't care about performance.
-- Jack Krupansky
On Fri, Oct 23, 2015 at 8:19 AM, Aki Balogh wrote:
> Hello,
>
> In our solr application, we use a Function Query (termfreq) very
only we know what your problem really was.
-- Jack Krupansky
On Thu, Oct 22, 2015 at 11:18 AM, Roxana Danger <
roxana.dan...@reedonline.co.uk> wrote:
> Hi Erik,
>
> Thanks for the links, but the analyzers are called correctly. The problem
> is that I need to get access to the
I checked the code and the limit is actually 5MB and configurable via
the blob.max.size.mb config property. I posted a comment on the Solr doc
for this.
In any case, thanks for sharing info that you gleaned from the conference,
for all of us who couldn't make it.
-- Jack Krupansky
On Tue
ard to specify the common prefix for the files.
-- Jack Krupansky
On Tue, Oct 20, 2015 at 8:19 AM, Shalin Shekhar Mangar <
shalinman...@gmail.com> wrote:
> No, the maximum size is limited to 2MB for now. The use-case behind
> the blob store is to store small jars (custom plugins) and s
networking.
In any case, keep those user reports flowing. I'm sure there are plenty of
people who didn't make it to the conference.
-- Jack Krupansky
On Sun, Oct 18, 2015 at 8:52 AM, Erik Hatcher
wrote:
> The Revolution was not televised (though heavily tweeted, and videos
ctual
algorithm, but the effect for a bunch of the common use cases.
-- Jack Krupansky
On Sun, Oct 18, 2015 at 10:18 AM, Erick Erickson
wrote:
> On the surface this seems like something of a distraction.
>
> 10M docs x 100 values/docs = 1B integers. Assuming all
> need to be held in m
initial query comes up empty, then you could move on to the next
highest most likely field, maybe product title (short one line
description), and query voluminous fields like detailed product
descriptions, specifications, and user comments/reviews only as a last
resort.
-- Jack Krupansky
On Tue
actual product names or important
keywords rather than random words from the English language that happen to
occur in descriptions, all of which would occur in a catchall.
-- Jack Krupansky
On Mon, Oct 12, 2015 at 8:39 AM, elisabeth benoit wrote:
> Hello,
>
> We're using solr 4.10 a
uot;. Including specific examples.
-- Jack Krupansky
On Fri, Oct 2, 2015 at 9:33 AM, remi tassing wrote:
> Hi,
> I have medium-low experience on Solr and I have a question I couldn't quite
> solve yet.
>
> Typically we have quite short query strings (a couple of words) and th
same machine as the Lucene/Solr index directory.
-- Jack Krupansky
On Fri, Oct 2, 2015 at 7:42 AM, Mark Fenbers wrote:
> Thanks for the suggestion, but I've looked at aspell and hunspell and
> neither provide a native Java API. Further, I already use Solr for a
> search engine, to
nalyzed as if it were simple
text.
-- Jack Krupansky
On Wed, Sep 30, 2015 at 9:32 AM, anil.vadhavane
wrote:
> Hi Benedetti,
>
> Yes, at first it looks like a user error and I am surprised as well with
> the
> case.
>
> We tested this on two different system. We tried it wi
/flexible/standard/StandardQueryParser.html
-- Jack Krupansky
On Mon, Sep 21, 2015 at 6:57 AM, Jack Krupansky
wrote:
> Probably a reference to the so-called flex query parser:
>
> https://lucene.apache.org/core/4_10_0/queryparser/org/apache/lucene/queryparser/flexible
-summary.html
The original Jira:
https://issues.apache.org/jira/browse/LUCENE-1567
This new query parser was dumped into Lucene some years ago, but I haven't
noticed any real activity or interest in it.
-- Jack Krupansky
On Mon, Sep 21, 2015 at 6:36 AM, Dmitry Kan wrote:
> Hello!
>
An update request processor is a preferred approach - take the source
value, split it, and create separate source values for each of the
associated fields.
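One way to sketch that (chain name and script file are hypothetical; the processor classes are stock Solr):

```xml
<updateRequestProcessorChain name="split-source">
  <processor class="solr.StatelessScriptUpdateProcessorFactory">
    <str name="script">split-source.js</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```

The script would read the source field from each document, split it, and add the pieces to the associated fields before RunUpdateProcessorFactory indexes the document.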
-- Jack Krupansky
On Wed, Sep 9, 2015 at 3:30 AM, Roxana Danger <
roxana.dan...@reedonline.co.uk> wrote:
> Hello,
> I have