Attached a patch to the JIRA issue.
Reviews are welcome.
On Thu, Dec 19, 2013 at 7:24 PM, Isaac Hebsh isaac.he...@gmail.com wrote:
Roman, do you have any results?
created SOLR-5561
Robert, if I'm wrong, you are welcome to close that issue.
On Mon, Dec 9, 2013 at 10:50 PM, Isaac Hebsh
created SOLR-5560
On Tue, Dec 10, 2013 at 8:48 AM, William Bell billnb...@gmail.com wrote:
Sounds like a bug.
On Mon, Dec 9, 2013 at 1:16 PM, Isaac Hebsh isaac.he...@gmail.com wrote:
If so, can someone suggest how a query should be escaped (securely and
correctly)?
Should I escape
Roman, do you have any results?
created SOLR-5561
Robert, if I'm wrong, you are welcome to close that issue.
On Mon, Dec 9, 2013 at 10:50 PM, Isaac Hebsh isaac.he...@gmail.com wrote:
You can see the norm value, in the explain text, when setting
debugQuery=true.
If the same item gets
Hi Robert and Manuel.
The DefaultSimilarity indeed sets discountOverlap to true by default.
BUT, the *factory*, aka DefaultSimilarityFactory, when called by
IndexSchema (the getSimilarity method), explicitly sets this value to the
value of its corresponding class member.
This class member is
If so, can someone suggest how a query should be escaped (securely and
correctly)?
Should I escape the quote mark (and backslash mark itself) only?
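One way to picture the escaping asked about here is the sketch below. It assumes the LocalParam value is wrapped in double quotes and that a backslash escapes both the quote character and the backslash itself. The helper name quote_local_param is hypothetical, and whether this is sufficient for every query parser is not confirmed by this thread.

```python
def quote_local_param(value: str, quote: str = '"') -> str:
    """Wrap a LocalParam value in quotes (hypothetical helper).

    Assumption: only the backslash and the enclosing quote character
    need a backslash escape inside a quoted LocalParam value.
    """
    # Escape backslashes first, so the backslashes added for quotes
    # are not escaped a second time.
    escaped = value.replace("\\", "\\\\").replace(quote, "\\" + quote)
    return quote + escaped + quote


nested = 'TERM2 TERM3 "TERM4 TERM5"'
local_params = "{!lucene df=text v=" + quote_local_param(nested) + "}"
print(local_params)
```

The value still has to be URL-encoded afterwards when it is sent as part of an HTTP request.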
On Fri, Dec 6, 2013 at 2:59 PM, Isaac Hebsh isaac.he...@gmail.com wrote:
Obviously, there is the option of external parameter ({...
v=$nestedq
created SOLR-5542.
Anyone else want it?
On Thu, Dec 5, 2013 at 8:55 PM, Isaac Hebsh isaac.he...@gmail.com wrote:
Hi,
It seems that a facet query does not use the global query parameters (for
example, field aliasing for edismax parser).
We have an intensive use of facet queries (in some
, Isaac Hebsh
isaac.he...@gmail.com
wrote:
Hi Robert and Manuel.
The DefaultSimilarity indeed sets discountOverlap to true by default.
BUT, the *factory*, aka DefaultSimilarityFactory, when called by
IndexSchema (the getSimilarity method), explicitly sets this value
like its
broken.
On Thu, Dec 5, 2013 at 1:53 PM, Isaac Hebsh isaac.he...@gmail.com wrote:
Hi,
we implemented a morphological analyzer, which stems words at index time.
For some reason, we index both the original word and the stem (on the
same
position, of course).
The stemming is done
We want to set a LocalParam on a nested query. When querying with an inline v
parameter, it works fine:
http://localhost:8983/solr/collection1/select?debugQuery=true&defType=lucene&df=id&q=TERM1 AND
{!lucene df=text v="TERM2 TERM3 \"TERM4 TERM5\""}
the parsedquery_toString is
+id:TERM1 +(text:term2
Obviously, there is the option of an external parameter ({...
v=$nestedq}&nestedq=...)
This is a good solution, but it is not practical when there are a lot of such
nested queries.
Any ideas?
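For readers weighing the external-parameter option, here is a rough sketch of generating the dereferenced parameters programmatically, so that many nested queries do not have to be written by hand. The function name build_params, the nestedq0, nestedq1, ... naming scheme, and the use of the _query_ hook for nesting are assumptions for illustration, not something stated in this thread.

```python
from urllib.parse import urlencode


def build_params(main_terms, nested_queries, df="text"):
    """Build Solr request parameters with one $-dereferenced
    parameter per nested query (hypothetical helper)."""
    params = {"debugQuery": "true", "defType": "lucene", "df": "id"}
    clauses = list(main_terms)
    for i, nested in enumerate(nested_queries):
        name = "nestedq%d" % i  # assumed naming scheme
        # The nested query text lives in its own request parameter,
        # so it needs no LocalParams-level escaping here.
        clauses.append('_query_:"{!lucene df=%s v=$%s}"' % (df, name))
        params[name] = nested
    params["q"] = " AND ".join(clauses)
    return params


params = build_params(["TERM1"], ['TERM2 TERM3 "TERM4 TERM5"'])
print("http://localhost:8983/solr/collection1/select?" + urlencode(params))
```

Only URL encoding is applied to the nested query text, which is what makes the dereferenced form easier to generate safely than the inline one.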
On Friday, December 6, 2013, Isaac Hebsh wrote:
We want to set a LocalParam on a nested query. When querying
Hi,
we implemented a morphological analyzer, which stems words at index time.
For some reason, we index both the original word and the stem (on the same
position, of course).
The stemming is done on a specific language, so other languages are not
stemmed at all.
Because of that, two documents with
Hi,
It seems that a facet query does not use the global query parameters (for
example, field aliasing for edismax parser).
We make intensive use of facet queries (in some cases, we have a lot of
facet.query parameters for a single q), and using LocalParams for each
facet.query is not convenient.
, Ahmet Arslan iori...@yahoo.com wrote:
Hi Isaac,
Did you consider omitting norms completely for that field? omitNorms=true
Are you using solr.RemoveDuplicatesTokenFilterFactory?
On Thursday, December 5, 2013 8:55 PM, Isaac Hebsh isaac.he...@gmail.com
wrote:
Hi,
we implemented a morphologic
Hi,
Try using facet.query on each part, you will get the number of total hits
for every OR.
If you need this info per document, the answers might appear when
specifying debugQuery=true. If that info is useful, try adding
[explain] to fl param (probably requires registering the augmenter plugin
Hi Dmitry,
I'm trying to examine your suggestion to create a frontend node. It sounds
pretty useful.
I saw that every node in a Solr cluster can serve requests for any collection,
even if it does not hold a core of that collection. Because of that, I
thought that adding a new node to the cluster
for reading the index, or more
CPUs because the merging process might be more CPU intensive).
Isn't it possible?
On Wed, Oct 2, 2013 at 12:42 AM, Shawn Heisey s...@elyograg.org wrote:
On 10/1/2013 2:35 PM, Isaac Hebsh wrote:
Hi Dmitry,
I'm trying to examine your suggestion to create a frontend
Hi,
Trying to solve a query performance issue, we suspect that the number of index
segments might slow the query (due to I/O seeks, which happen for each
term in the query, multiplied by the number of segments).
We are on Solr 4.3 (TieredMergePolicy with mergeFactor of 4).
We can reduce the number of
Hi Greg, Did you get an answer?
I'm interested in the same question.
More generally, what are the benefits of HdfsDirectoryFactory, besides the
transparent restore of the shard contents in case of a disk failure, and
the ability to rebuild index using MR?
Is the next statement exact? blocks of a
/SOLR-5053
What would you do?
On Tue, Sep 17, 2013 at 10:31 PM, Isaac Hebsh isaac.he...@gmail.com wrote:
Hi everyone,
We developed a TokenFilter.
It should act differently, depending on a parameter supplied in the
query (for query chain only, not the index one, of course).
We found no way
Hi everyone,
We developed a TokenFilter.
It should act differently, depending on a parameter supplied in the
query (for query chain only, not the index one, of course).
We found no way to pass that parameter into the TokenFilter flow. I guess
that the root cause is because TokenFilter is a pure
Hi,
We've investigated a memory dump, which was taken after some frequent OOM
incidents.
The main issue we found was millions of LazyField instances,
taking ~2GB of memory, even though queries request about 10 small fields
only.
We've found that LazyDocument creates a LazyField object
Thanks Hoss.
1. We currently use Solr 4.3.0.
2. I understand this architecture of LazyFields, but I did not understand
why multiple LazyFields should be created for the multivalued field. You
can't load a part of them. If you request the field, you will get ALL of
its values. so 100 (or more)
Thanks to Ryan Ernst, my issue is duplicate of SOLR-4449.
I think that this proposal might be very useful (some supporting links are
attached there. worth reading..)
On Tue, Jul 30, 2013 at 11:49 PM, Isaac Hebsh isaac.he...@gmail.com wrote:
Hi,
I submitted a new JIRA for this:
https
_why_
you so often have a slow shard and whether the problem could
be cured with, say, better warming queries on the shards...
Best
Erick
On Fri, Jul 26, 2013 at 8:23 AM, Isaac Hebsh isaac.he...@gmail.com
wrote:
Hi!
When SolrCloud executes a query, it creates shard requests, which
or not.. :)
On Sun, Jul 28, 2013 at 1:06 AM, Shawn Heisey s...@elyograg.org wrote:
On 7/27/2013 3:33 PM, Isaac Hebsh wrote:
I have about 40 shards. repFactor=2.
The cause of slower shards is very interesting, and this is the main
approach we took.
Note that in every query, it is another shard
Hi!
When SolrCloud executes a query, it creates shard requests, which are sent
to one replica of each shard. Total QTime is determined by the slowest
shard response (plus some extra time). [For simplicity, let's assume that
no stored fields are requested.]
I suffer from a situation where in
Hi,
There was a thread about viewing the Solr Wiki offline, about 6 months ago. I'm
interested, too.
It seems that a manual (cron?) dump will do the work...
Would it be too much to ask that one of the admins will manually create
such a dump? (http://moinmo.in/HelpOnMoinCommand/ExportDump)
Otis, is
.
You could try with higher Solr versions too. If it does not work, please
let us know.
https://issues.apache.org/jira/secure/attachment/12579832/ComplexPhrase-4.2.1.zip
From: Isaac Hebsh isaac.he...@gmail.com
To: solr-user@lucene.apache.org
Sent: Saturday
wanted
to use these for production.
I confess I don't know what state they were left in or why they were
never committed.
FWIW,
Erick
On Wed, Jun 19, 2013 at 10:08 AM, Isaac Hebsh isaac.he...@gmail.com
wrote:
Hi,
I'm trying to understand what is the status of enabling wildcards
Hi,
I'm trying to understand what is the status of enabling wildcards on phrase
queries?
Lucene JIRA issue: https://issues.apache.org/jira/browse/LUCENE-1486
Solr JIRA issue: https://issues.apache.org/jira/browse/SOLR-1604
It looks like these issues are not going to be solved in the close
Hi everyone,
My SolrCloud cluster (4.3.0) went into production a few days ago.
Docs are being indexed into Solr using /update requestHandler, as a POST
request, containing text/xml content-type.
The collection is sharded into 36 pieces, each shard has two replicas.
There are 36 nodes (each
28, 2013 at 7:08 AM, Isaac Hebsh isaac.he...@gmail.com wrote:
I don't want to affect the (correctness of the) real query parsing, so
creating a QParserPlugin is risky.
Instead, If I'll parse the query in my search component, it will be
detached from the real query parsing, (obviously
Hi.
Searching terms with wildcard in their start, is solved with
ReversedWildcardFilterFactory. But, what about terms with wildcard in both
start AND end?
This query is heavy, and I want to disallow such queries from my users.
I'm looking for a way to cause these queries to fail.
I guess there
this way, you are changing semantics - but don't need to touch the syntax
definition; of course, you may also change the grammar and allow only one
instance of wildcard (or some combination) but for that you should probably
use LUCENE-5014
roman
On Mon, May 27, 2013 at 2:18 PM, Isaac Hebsh
,
or just above, the wildcard processor
also make sure you are setting your qparser for FQ queries, ie.
fq={!nw}foo
On Mon, May 27, 2013 at 5:01 PM, Isaac Hebsh isaac.he...@gmail.com
wrote:
Thanks Roman.
Based on some of your suggestions, will the steps below do the work?
* Create
:38 , Isaac Hebsh wrote:
Hi,
I'm trying to use Surround Query Parser for two reasons, which are not
covered by proximity slops:
1. find documents with two words within a given distance, *unordered*
2. given two lists of words, find documents with (at least) one word from
list A and (at least
Hi everyone..
I'm indexing docs into Solr using the update request handler, by POSTing
data to the REST endpoint (not SolrJ, not DIH).
My indexer should return an indication of whether the document existed in the
collection before or not, based on its ID.
The obvious solution is to perform a
Hi,
I'm trying to use Surround Query Parser for two reasons, which are not
covered by proximity slops:
1. find documents with two words within a given distance, *unordered*
2. given two lists of words, find documents with (at least) one word from
list A and (at least) one word from list B, within
Hi Tim,
Are you running Solr 4.2? (In 4.0 and 4.1, the Collections API didn't
return any failure message. see SOLR-4043 issue).
As far as I know, you can't tell Solr to use authentication credentials
when communicating with other nodes. It's a bigger issue.. for example, if you
want to protect the
Let's say you have machine A and machine B. you want to shutdown B.
If all the shards on B have replicas (on A), you can shutdown B instantly.
If there is a shard on B that has no replica, you should create one on
machine A (using Core API), let it replicate the whole shard contents, and
then you
Hi,
The example schema.xml in Solr 4.2 does not define id field
as docValues=true.
Any good reason? (other than backward compat for index for previous
version...)
If my common case is fl=id (and no other field), DocValues is classic for
me. Am I right?
Hi,
I'm trying to monitor some Solr behaviour, using JMX.
It looks like a great job was done there, but I can't find any
documentation on the MBeans themselves.
For example, DirectUpdateHandler2 attributes. What is the difference
between adds and cumulative_adds? Does adds count the last X seconds
).
This solution exactly covers my case. Thank you!
On Wed, Feb 20, 2013 at 11:33 PM, Isaac Hebsh isaac.he...@gmail.com wrote:
Nobody responded to my JIRA issue :(
Should I commit this patch into SVN's trunk, and set the issue as Resolved?
On Sun, Feb 17, 2013 at 9:26 PM, Isaac Hebsh isaac.he
Hi.
I add documents to Solr by POSTing them to UpdateHandler, as bulks of add
commands (DIH is not used).
If one document contains any invalid data (e.g. string data into numeric
field), Solr returns HTTP 400 Bad Request, and the whole bulk is failed.
I'm searching for a way to tell Solr to
Nobody responded to my JIRA issue :(
Should I commit this patch into SVN's trunk, and set the issue as Resolved?
On Sun, Feb 17, 2013 at 9:26 PM, Isaac Hebsh isaac.he...@gmail.com wrote:
Thank you Alex.
Atomic Update allows you to add new values into multivalued field, for
example... It means
Thank you Alex.
Atomic Update allows you to add new values into multivalued field, for
example... It means that the original document is being read (using
RealTimeGet, which depends on updateLog).
There is no reason that the list of operations (add/set/inc) will not
include a create-only
I opened a JIRA for this improvement request (attached a patch to
DistributedUpdateProcessor).
It's my first JIRA. please review it...
(Or, if someone has an easier solution, tell us...)
https://issues.apache.org/jira/browse/SOLR-4468
On Fri, Feb 15, 2013 at 8:13 AM, Isaac Hebsh isaac.he
created in the system? I think an external
create timestamp would be a lot more useful.
wunder
On Feb 16, 2013, at 12:37 PM, Isaac Hebsh wrote:
I opened a JIRA for this improvement request (attached a patch to
DistributedUpdateProcessor).
It's my first JIRA. please review
...@odoko.co.uk wrote:
I think what Walter means is make the thing that sends it to Solr set
the timestamp when it does so.
Upayavira
On Sat, Feb 16, 2013, at 08:56 PM, Isaac Hebsh wrote:
Hi,
I do have an externally-created timestamp, but some minutes may pass
before
it will be sent
them as a MUST clause, like
+(original query) +id:(1 2 3 4).
Third possibility, see https://issues.apache.org/jira/browse/SOLR-2429,
but
the short form is:
fq={!cache=false}restoffq
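The two forms mentioned in this reply, an extra MUST clause and an uncached filter query, can be sketched as plain string builders. The function names and the default id field name below are hypothetical; only the +(original query) +id:(...) shape and the {!cache=false} LocalParam come from the message itself.

```python
def restrict_as_must_clause(query, ids, id_field="id"):
    """Return the original query ANDed with an ID list as a MUST clause."""
    id_list = " ".join(str(i) for i in ids)
    return "+(%s) +%s:(%s)" % (query, id_field, id_list)


def restrict_as_uncached_filter(ids, id_field="id"):
    """Return an fq value that restricts to the IDs without caching
    the filter (it is unlikely to be reused across requests)."""
    id_list = " ".join(str(i) for i in ids)
    return "{!cache=false}%s:(%s)" % (id_field, id_list)


print(restrict_as_must_clause("original query", [1, 2, 3, 4]))
print(restrict_as_uncached_filter([1, 2, 3, 4]))
```

IDs that contain spaces or query syntax characters would need escaping before being joined; the sketch assumes simple numeric or alphanumeric IDs.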
On Mon, Feb 11, 2013 at 2:41 PM, Isaac Hebsh isaac.he...@gmail.com
wrote:
Hi everyone.
I have queries
Hi everyone.
I have queries that should be bounded to a set of IDs (the uniqueKey field
of my schema).
My client front-end sends two Solr requests:
In the first one, it wants to get the top X IDs. This result should return
very fast. No time to waste on highlighting. this is a very standard
query.
Shawn, what about 'flush to disk' behaviour on MMapDirectoryFactory?
On Fri, Feb 8, 2013 at 11:12 AM, Prakhar Birla prakharbi...@gmail.com wrote:
Great explanation Shawn! BTW soft committed documents will not be
recovered on JVM crash.
On 8 February 2013 13:27, Shawn Heisey
Small addition:
To support query, I probably have to implement an analyzer (query time)...
An analyzer can be configured on numeric (i.e non TEXT) field?
On Thu, Feb 7, 2013 at 6:48 PM, Isaac Hebsh isaac.he...@gmail.com wrote:
Hi.
I have to index field which contains an IP address.
Users
. I feel that we can achieve some improvement in this
case...
On Mon, Feb 4, 2013 at 12:45 AM, Shawn Heisey s...@elyograg.org wrote:
On 2/3/2013 3:24 PM, Isaac Hebsh wrote:
Thanks Shawn for your quick answer.
When using collection name, Solr will choose the leader, when available
/2013 12:06 PM, Isaac Hebsh wrote:
LBHttpSolrServer is a SolrJ-only feature, isn't it?
I think that Solr does not balance queries among cores in the same server.
You can claim that it's a non-issue, if a single core can completely serve
multiple queries on the same time, and passing requests
works well here, wouldn't utilizing all
the cores be useful?
On Sun, Feb 3, 2013 at 11:49 PM, Shawn Heisey s...@elyograg.org wrote:
On 2/3/2013 1:18 PM, Isaac Hebsh wrote:
Hi.
I have a SolrCloud cluster, which contains some servers. each server runs
multiple cores.
I want to distribute
, and the boost is pretty
impressive (roughly 2-5x faster for a complicated query)
Ming
On Mon, Jan 28, 2013 at 10:54 AM, Isaac Hebsh isaac.he...@gmail.com
wrote:
Does adding replicas (on additional servers) help to improve search
performance?
It is known that each query goes to all the shards
You can define a security filter in WEB-INF\web.xml, on specific url
patterns.
You might want to set the url pattern to /admin/*.
[find examples here:
http://stackoverflow.com/questions/7920092/how-can-i-bypass-security-filter-in-web-xml
]
On Sun, Jan 27, 2013 at 8:07 PM, Mingfeng Yang
...
On Thu, Jan 24, 2013 at 3:31 AM, Otis Gospodnetic
otis.gospodne...@gmail.com wrote:
Hi,
I think trie type fields add value only if you do range queries on them and
it sounds like that is not your use case.
Otis
Solr ElasticSearch Support
http://sematext.com/
On Jan 23, 2013 2:53 PM, Isaac
(http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html).
Find a bottleneck _then_ tune. Premature optimization and all
that
Several tens of millions of docs isn't that large unless the text
fields are enormous.
Best
Erick
On Sat, Jan 19, 2013 at 2:32 PM, Isaac Hebsh
. openSearcher=false makes sense when you are using
hard-commits together with soft-commits, as the soft-commit is dealing
with opening/closing searchers, you don't need hard commits to do it.
Tomás
On Fri, Jan 18, 2013 at 2:20 AM, Isaac Hebsh isaac.he...@gmail.com
wrote:
Unfortunately
integrity. Not to mention that your tlog will be huge.
Not to mention that there is some memory usage for each document in
the tlog. Hard commits roll over the tlog, flush the in-memory tlog
pointers, close index segments, etc.
Best
Erick
On Thu, Jan 17, 2013 at 1:29 PM, Isaac Hebsh