searcher?
-Is it a good idea to set
openSearcher=false in auto commit
and rely on soft auto commit to see new data in searches?
thanks
Matteo Grolla
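For context, the commit setup the question describes typically looks like this in solrconfig.xml (a sketch; the maxTime values are made-up examples, not recommendations):

```xml
<!-- hard commits flush to disk but don't open a new searcher -->
<autoCommit>
  <maxTime>60000</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>

<!-- soft commits make newly indexed data visible to searches -->
<autoSoftCommit>
  <maxTime>15000</maxTime>
</autoSoftCommit>
```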
I'd like to have some suggestion on how to improve the indexing performance on
the following scenario
I'm uploading 1M docs to solr,
every doc has
id: sequential number
title: small string
date: date
body: 1kb of text
Here are my benchmarks (they are all
this quite frequently, 15 seconds seems
quite reasonable.
Best,
Erick
On Sun, Oct 6, 2013 at 12:19 PM, Matteo Grolla matteo.gro...@gmail.com
wrote:
I'd like to have some suggestion on how to improve the indexing performance
on the following scenario
I'm uploading 1M docs to solr,
every
Hi,
I'd really appreciate if you could give me some help understanding how
to tune the document cache.
My thoughts:
min values: max_results * max_concurrent_queries, as stated by
http://wiki.apache.org/solr/SolrCaching
how can I estimate max_concurrent_queries?
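A back-of-the-envelope sketch for that estimate (my own assumption: Little's law, concurrency ≈ query rate × average latency; the function name, safety factor, and numbers are all made up for illustration):

```python
import math

def min_document_cache_size(max_results, qps, avg_latency_s, safety_factor=2):
    """Estimate a lower bound for the documentCache 'size' setting:
    max results per query * estimated max concurrent queries."""
    # Little's law: concurrent queries ~= arrival rate * average latency
    max_concurrent = math.ceil(qps * avg_latency_s * safety_factor)
    return max_results * max(1, max_concurrent)

# e.g. 100 rows per query, 50 queries/sec, 200 ms average latency:
print(min_document_cache_size(100, 50, 0.2))  # 100 * ceil(50*0.2*2) = 2000
```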
Hi everybody
can anyone give me a suitable interpretation for cat_rank in
http://people.apache.org/~hossman/ac2012eu/ slide 15
thanks
Thanks a lot
and thanks for pointing me at the video. I missed it
Matteo
On 05 May 2014, at 20:44, Chris Hostetter wrote:
: Hi everybody
: can anyone give me a suitable interpretation for cat_rank in
: http://people.apache.org/~hossman/ac2012eu/ slide 15
Have
Hi everybody,
I'm having troubles with the function query
query(subquery, default)
http://wiki.apache.org/solr/FunctionQuery#query
running this
http://localhost:8983/solr/select?q=query($qq,1)&qq={!dismax qf=text}hard drive
on collection1 gives me no results
but I
Thanks very much,
I realized too late that I skipped an important part of the wiki
documentation: this example assumes defType=func
thanks a lot
On 06 May 2014, at 21:05, Yonik Seeley wrote:
On Tue, May 6, 2014 at 5:08 AM, Matteo Grolla matteo.gro...@gmail.com
Hi,
I developed a new Solr ResponseWriter but I'm not happy with how I wrote
the tests.
My problem is that I need to test it both with local requests and with
distributed requests, since the Solr response objects (the input to the
response writer) are different.
a) I tested the local request case
Hi,
can anybody confirm this?
If I add multiple documents with the same id but differing on other fields and
then issue a commit (no commits before this) the last added document gets
indexed, right?
Assumption: using Solr 4 and default settings for optimistic locking.
Matteo
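A toy model of the behaviour being asked about (an illustration only, not Solr code): within one commit window, a later add with the same uniqueKey overwrites the earlier one, so only the last version becomes visible after commit.

```python
# Toy model of Solr's overwrite-by-uniqueKey semantics (illustration only).
class ToyIndex:
    def __init__(self):
        self._pending = {}   # uniqueKey -> document, before commit
        self._visible = {}   # what searches see after commit

    def add(self, doc):
        self._pending[doc["id"]] = doc  # later add overwrites earlier one

    def commit(self):
        self._visible.update(self._pending)
        self._pending.clear()

idx = ToyIndex()
idx.add({"id": "1", "title": "first version"})
idx.add({"id": "1", "title": "second version"})
idx.commit()
print(idx._visible["1"]["title"])  # second version
```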
Thanks really a lot Yonik!
On 03 Nov 2014, at 15:51, Yonik Seeley wrote:
On Mon, Nov 3, 2014 at 8:53 AM, Matteo Grolla matteo.gro...@gmail.com wrote:
Hi,
can anybody confirm this?
If I add multiple documents with the same id but differing on other fields
Can anyone tell me the behavior of solr (and if it's consistent) when I do what
follows:
1) add document x
2) delete document x
3) commit
I've tried with solr 4.5.0 and document x gets indexed
Matteo
-Original Message- From: Matteo Grolla
Sent: Wednesday, November 5, 2014 4:47 AM
To: solr-user@lucene.apache.org
Subject: add and then delete same document before commit,
Can anyone tell me the behavior of solr (and if it's consistent) when I do
what follows:
1) add document x
2
Hi,
I'm thinking about having an instance of solr (SolrA) with all fields
stored and just id indexed in addition with a normal production instance of
solr (SolrB) that is used for the searches.
This would allow me to read only what changed from previous crawl, update SolrA
and send the
Wow!!!
thanks Joe!
On 02 Feb 2015, at 15:05, Joseph Obernberger wrote:
I have a similar use-case. Check out the export capability and using
cursorMark.
-Joe
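A sketch of the cursorMark loop Joe mentions (fetch_page below is a stand-in for the real HTTP call to /select; with actual Solr you would sort on the uniqueKey, e.g. sort=id asc, and pass the cursorMark parameter). The stopping rule mirrors Solr's: you are done when nextCursorMark stops advancing.

```python
# Deep paging with cursorMark, with a fake backend standing in for Solr.
def fetch_all(fetch_page, rows=2):
    cursor = "*"   # Solr's initial cursorMark value
    docs = []
    while True:
        page = fetch_page(cursor=cursor, rows=rows)
        docs.extend(page["docs"])
        if page["nextCursorMark"] == cursor:  # cursor stopped advancing: done
            break
        cursor = page["nextCursorMark"]
    return docs

# Fake paged backend for illustration only:
DATA = [{"id": i} for i in range(5)]

def fake_page(cursor, rows):
    start = 0 if cursor == "*" else int(cursor)
    chunk = DATA[start:start + rows]
    nxt = cursor if not chunk else str(start + len(chunk))
    return {"docs": chunk, "nextCursorMark": nxt}

print(len(fetch_all(fake_page)))  # 5
```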
On 2/2/2015 8:14 AM, Matteo Grolla wrote:
Hi,
I'm thinking about having an instance of solr
Hi,
hope someone can help me troubleshoot this issue.
I'm trying to setup a solrcloud cluster with
-zookeeper on 192.168.1.8 (osx mac)
-solr1 on 192.168.1.10 (virtualized ubuntu running on mac)
-solr2 on 192.168.1.3 (ubuntu on another pc)
the problem is
Solved!
Ubuntu has an entry like this in /etc/hosts:
127.0.1.1 hostname
to properly run SolrCloud one must replace 127.0.1.1 with the host's real
(ideally permanent) IP address
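Concretely, the fix looks like this (the host name and IP below are examples, not the poster's actual values):

```text
# /etc/hosts on the Ubuntu node -- replace the loopback alias
#   127.0.1.1   solr1
# with the machine's real, stable address:
192.168.1.10  solr1
```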
On 12 Jan 2015, at 12:47, Matteo Grolla wrote:
Hi,
hope someone can help me
Hi,
is there any public benchmark or description of how the solr stats
component works?
Matteo
Hi,
I tried performing a join query
{!join from=fA to=fB}
where fA was string and fB was text using keywordTokenizer
it doesn't work, but it does if the fields are both string or both
text.
If you confirm this is the correct behavior I'll
used the keywordTokenizer, was there other analysis such as
lowercasing going on?
-Yonik
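For reference, a schema sketch of the combination that does work (field names from the thread, everything else assumed): both sides of the join share the same unanalyzed type, so the joined terms match exactly.

```xml
<!-- hypothetical schema fragment: both join sides use the same type -->
<field name="fA" type="string" indexed="true" stored="false"/>
<field name="fB" type="string" indexed="true" stored="false"/>
<!-- {!join from=fA to=fB} then compares identical, unanalyzed terms -->
```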
On Mon, May 18, 2015 at 10:26 AM, Matteo Grolla matteo.gro...@gmail.com
wrote:
Hi,
I tried performing a join query
{!join from=fA to=fB}
where fA was string and fB was text
I wouldn't add the complexity
you're talking about, especially at the volumes you're talking.
Best,
Erick
On Thu, May 21, 2015 at 3:20 AM, Matteo Grolla matteo.gro...@gmail.com
wrote:
Hi
I'd like some feedback on how I'd like to solve the following
sharding problem
I have
Hi
I'd like some feedback on how I'd like to solve the following sharding problem
I have a collection that will eventually become big
Average document size is 1.5kb
Every year 30 Million documents will be indexed
Data come from different document producers (a person, owner of his documents)
Hi,
what is the performance impact of issuing a splitshard on a live node
used for searches?
I'm designing a solr cloud installation where nodes from a single cluster
are distributed on 2 datacenters which are close and very well connected.
let's say that zk nodes zk1, zk2 are on DC1 and zk3 is on DC2 and let's say
that DC1 goes down and the cluster is left with zk3.
how can I restore a
ver.wunderwood.org/ (my blog)
>
>
> > On Oct 29, 2015, at 10:08 AM, Matteo Grolla <matteo.gro...@gmail.com>
> wrote:
> >
> > I'm designing a solr cloud installation where nodes from a single cluster
> > are distributed on 2 datacenters which are close and ve
o take all the zookeeper nodes
> down.
>
> -- Pushkar Raste
> On Oct 29, 2015 4:33 PM, "Matteo Grolla" <matteo.gro...@gmail.com> wrote:
>
> > Hi Walter,
> > it's not a problem to take down zk for a short (1h) time and
> > reconfigure it. Meanwhile
Hi,
I'm doing this test
collection test is replicated on two solr nodes running on 8983, 8984
using external zk
1)turn off solr 8984
2)add,commit a doc x on solr 8983
3)turn off solr 8983
4)turn on solr 8984
5)shortly after (leader still not elected) turn on solr 8983
6)8984 is elected as
>
> On 15 October 2015 at 16:16, Matteo Grolla <matteo.gro...@gmail.com>
> wrote:
>
> > Hi,
> > I'm doing this test
> > collection test is replicated on two solr nodes running on 8983, 8984
> > using external zk
> >
> > 1)turn OFF solr 8984
ailure in 4.6 and a commit happened between the original
> insert and the delete? Just askin'...
>
> Best,
> Erick
>
> On Wed, Nov 18, 2015 at 8:21 AM, Matteo Grolla <matteo.gro...@gmail.com>
> wrote:
> > Thanks Shawn,
> >I'm aware that solr isn't transactional and
solr version.
2015-11-18 16:51 GMT+01:00 Shawn Heisey <apa...@elyograg.org>:
> On 11/18/2015 8:21 AM, Matteo Grolla wrote:
> > On Solr 4.10.3 I'm noting a different (desired) behaviour
> >
> > 1) add document x
> > 2) delete document x
> > 3) commit
On Solr 4.10.3 I'm noticing a different (desired) behaviour
1) add document x
2) delete document x
3) commit
document x doesn't get indexed.
The question now is: Can I count on this behaviour or is it just incidental?
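As an illustration of the 4.10.3 behaviour (a toy replay of ordered updates, not Solr internals): operations on the same id apply in order, so an add followed by a delete before the commit leaves nothing indexed.

```python
# Toy replay of an ordered update sequence (illustration only).
def replay(ops):
    visible = {}
    for op, doc_id, *payload in ops:
        if op == "add":
            visible[doc_id] = payload[0]
        elif op == "delete":
            visible.pop(doc_id, None)  # delete wins over an earlier add
    return visible

result = replay([("add", "x", {"title": "t"}), ("delete", "x")])
print("x" in result)  # False
```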
2014-11-05 22:21 GMT+01:00 Matteo Grolla <matteo.gro...@gmail.
time when the batch has errors and rely on Solr overwriting
> any docs in the batch that were indexed the first time.
>
> Best,
> Erick
>
> On Mon, Sep 28, 2015 at 2:27 PM, Matteo Grolla <matteo.gro...@gmail.com>
> wrote:
> > Hi,
> > if I need fine grained er
Hi,
if I need fine-grained error reporting I use HttpSolrServer and send
1 doc per request using the add method.
I report errors on exceptions of the add method.
I'm using autocommit so I'm not seeing errors related to commit.
Am I losing some errors? Is there a better way?
Thanks
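One possible middle ground (a sketch under my own assumptions; send_batch stands in for the real client's add call, and the doc shapes are invented): index in batches for throughput, and only when a batch fails re-send its docs one by one to isolate the bad ones.

```python
# Batch indexing with per-document fallback for error reporting.
def index_with_error_report(docs, send_batch, batch_size=100):
    failed = []
    for i in range(0, len(docs), batch_size):
        batch = docs[i:i + batch_size]
        try:
            send_batch(batch)
        except Exception:
            for doc in batch:           # isolate the offending docs
                try:
                    send_batch([doc])
                except Exception as e:
                    failed.append((doc["id"], str(e)))
    return failed

# Fake sender that rejects one doc, for illustration:
def fake_send(batch):
    if any(d["id"] == "bad" for d in batch):
        raise ValueError("malformed field")

errs = index_with_error_report(
    [{"id": "1"}, {"id": "bad"}, {"id": "3"}], fake_send, batch_size=2)
print(errs)  # [('bad', 'malformed field')]
```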
I ask, because these settings can solve the problems you've mentioned
> > without the need to add any additional functionality.
> >
> > On Tue, Jan 5, 2016 at 9:04 PM Matteo Grolla <matteo.gro...@gmail.com>
> > wrote:
> >
> >> Hi Binoy,
> >>
Hi Luca,
not sure if I understood well. Your question is
"Why are index times on a SolrCloud collection with 2 replicas higher than
on SolrCloud with 1 replica", right?
Well, with 2 replicas all docs have to be separately indexed in 2 places and
solr has to confirm that both indexing went
Hi,
after looking at the presentation on cloudsearch from Lucene Revolution
2014
https://www.youtube.com/watch?v=RI1x0d-yO8A&list=PLU6n9Voqu_1FM8nmVwiWWDRtsEjlPqhgP&index=49
min 17:08
I realized I'd love to be able to remove the burden of disabling filter
query caching from developers
the problem:
olrCaching
> http://yonik.com/advanced-filter-caching-in-solr/
>
>
> On Tue, Jan 5, 2016 at 7:28 PM Matteo Grolla <matteo.gro...@gmail.com>
> wrote:
>
> > Hi,
> > after looking at the presentation of cloudsearch from lucene
> revolution
> > 2014
> >
lauses are very restrictive, I
> wonder what happens if
> you add a cost in. fq's are evaluated in cost order (when
> cache=false), so what happens
> in this case?
> &fq={!cache=false cost=101}n_rea:xxx&fq={!cache=false
> cost=102}provincia:&fq={!cache=false cost=103}type:
Hi,
can you confirm that the realtime get requirements are just:

  <requestHandler name="/get" class="solr.RealTimeGetHandler">
    <lst name="defaults">
      <str name="omitHeader">true</str>
      <str name="wt">json</str>
      <str name="indent">true</str>
    </lst>
  </requestHandler>

  <updateLog>
    <str name="dir">${solr.ulog.dir:}</str>
  </updateLog>
Thanks Shawn,
On a production solr instance some cores take a long time to load
while others of similar size take much less. One of the differences between
these cores is the directoryFactory.
2016-01-12 15:34 GMT+01:00 Shawn Heisey <apa...@elyograg.org>:
> On 1/12/2016 2:50 A
ok,
the suggester was responsible for the long load time.
Thanks
2016-01-12 15:47 GMT+01:00 Matteo Grolla <matteo.gro...@gmail.com>:
> Thanks Shawn,
> On a production solr instance some cores take a long time to load
> while other of similar size take much less. One of
Hi,
I'm trying to optimize a Solr application.
The bottleneck is queries that request 1000 rows from Solr.
Unfortunately the application can't be modified at the moment; can you
suggest what could be done on the Solr side to improve performance?
The bottleneck is just on fetching the
and it takes 15s
execute it with rows = 400 and it takes 3s
it seems that below rows = 400 times are acceptable; beyond that they get slow
2016-02-11 11:27 GMT+01:00 Upayavira <u...@odoko.co.uk>:
>
>
> On Thu, Feb 11, 2016, at 09:33 AM, Matteo Grolla wrote:
> > Hi,
> > I'
Matteo Grolla <matteo.gro...@gmail.com>:
> Hi Yonik,
> after the first query I find 1000 docs in the document cache.
> I'm using curl to send the request and requesting javabin format to mimic
> the application.
> gc activity is low
> I managed to load the entire 50GB index
anymore.
Times improve: queries that took ~30s now take <10s, but I hoped for better.
I'm going to use jvisualvm's sampler to analyze where time is spent
2016-02-11 15:25 GMT+01:00 Yonik Seeley <ysee...@gmail.com>:
> On Thu, Feb 11, 2016 at 7:45 AM, Matteo Grolla <matteo.gro...@gma
<jack.krupan...@gmail.com>:
> Is this a scenario that was working fine and suddenly deteriorated, or has
> it always been slow?
>
> -- Jack Krupansky
>
> On Thu, Feb 11, 2016 at 4:33 AM, Matteo Grolla <matteo.gro...@gmail.com>
> wrote:
>
> > Hi,
> >
[image: inline image 1]
2016-02-11 16:05 GMT+01:00 Matteo Grolla <matteo.gro...@gmail.com>:
> I see a lot of time spent in splitOnTokens
>
> which is called by (last part of stack trace)
>
> BinaryResponseWriter$Resolver.writeResultsBody()
> ...
> solr.sea
re consuming the bulk of qtime.
>
> -- Jack Krupansky
>
> On Thu, Feb 11, 2016 at 11:33 AM, Matteo Grolla <matteo.gro...@gmail.com>
> wrote:
>
> > virtual hardware, 200ms is taken on the client until response is written
> to
> > disk
> > qtime on solr is
<t...@statsbiblioteket.dk>:
> On Thu, 2016-02-11 at 11:53 +0100, Matteo Grolla wrote:
> > I'm working with solr 4.0, sorting on score (default).
> > I tried setting the document cache size to 2048, so all docs of a single
> > request fit (2 requests fit actually)
> &g
; What does they query look like? Is it complex or use wildcards or function
> queries, or is it very simple keywords? How many operators?
>
> Have you used the debugQuery=true parameter to see which search components
> are taking the time?
>
> -- Jack Krupansky
>
> On Thu,
nt modern hardware.
>
> -- Jack Krupansky
>
> On Thu, Feb 11, 2016 at 10:36 AM, Matteo Grolla <matteo.gro...@gmail.com>
> wrote:
>
> > Hi Jack,
> > response time scales with rows. The relationship doesn't seem linear, but
> > Below 400 rows times are much fas
Hi,
I'm experimenting with the query REST API on Solr 5.4 and I'm noticing
that query parameters are not logged in solr.log.
Here are query and log line
curl -XGET 'localhost:8983/solr/test/query' -d '{"query":"*:*"}'
2016-04-28 09:16:54.008 INFO (qtp668849042-17) [ x:test]
o.a.s.c.S.Request
c name_t:"white cat"
>
> Can you open a JIRA for this?
>
> -Yonik
>
>
> On Mon, May 16, 2016 at 10:23 AM, Matteo Grolla <matteo.gro...@gmail.com>
> wrote:
> > Hi everyone,
> > I have a problem with nested queries
> > I
Hi everyone,
I have a problem with nested queries
If the order is:
1) query
2) nested query (embedded in _query_:"...")
everything works fine
if it is the opposite, like this
you cannot use the GET
> HTTP method (-XGET) and pass parameters in the POST body (-d).
>
> Try to remove the -XGET parameter.
>
> On Thu, Apr 28, 2016 at 11:18 AM, Matteo Grolla <matteo.gro...@gmail.com>
> wrote:
>
> > Hi,
> > I'm experimenting the query rest api wi
Hi Alessandro,
your shot in the dark was interesting, but the behaviour doesn't
depend on the field being mandatory; it works like this for every field. So
it seems just wrong:
df=field&q=*
should be translated as field:*
not as *:*
2016-07-28 10:32 GMT+02:00 Matteo Grolla <matteo.
// *:* -> MatchAllDocsQuery
if ("*".equals(termStr)) {
  if ("*".equals(field) || getExplicitField() == null) {
    return newMatchAllDocsQuery();
  }
}
2016-07-28 9:40 GMT+02:00 Matteo Grolla <matteo.gro...@gmail.com>:
> I noticed the behaviour
I noticed the behaviour in solr 4.10 and 5.4.1
2016-07-28 9:36 GMT+02:00 Matteo Grolla <matteo.gro...@gmail.com>:
> Hi,
> I'm surprised by lucene query parser translating this query
>
> http://localhost:8983/solr/collection1/select?df=id&q=*
>
> in
>
>
Hi,
I'm surprised by the lucene query parser translating this query
http://localhost:8983/solr/collection1/select?df=id&q=*
into
MatchAllDocsQuery(*:*)
I was expecting it to execute "id:*".
Is it a bug or desired behaviour? If desired, can you explain why?
Hi,
the export handler returns 0 for null numeric values.
Can someone explain why it doesn't leave the field off the record, like it
does for string or multivalued fields?
thanks
Matteo
Hi,
is there a reason why the export handler doesn't support date fields?
thanks
Matteo Grolla
It seems to me that the estimation in MB is in fact an estimation in GB:
the formula includes the avg doc size, which is in KB, so the result is in
KB and should be divided by 1024 to obtain the result in MB.
But it's divided by 1024*1024.
Right Alessandro that's another bug
Cheers
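The unit check can be done by hand (a sketch using the 30M docs/year and 1.5 KB average-size figures from the sharding thread above; this is my own arithmetic, not the tool's actual formula):

```python
# Avg doc size in KB times doc count gives KB; dividing by 1024 yields MB,
# and only dividing by 1024*1024 would yield GB -- hence the reported bug.
def index_size_mb(num_docs, avg_doc_kb):
    return num_docs * avg_doc_kb / 1024            # KB -> MB

def index_size_gb(num_docs, avg_doc_kb):
    return num_docs * avg_doc_kb / (1024 * 1024)   # KB -> GB

# 30M docs at 1.5 KB each:
print(round(index_size_mb(30_000_000, 1.5)))       # 43945 (MB)
print(round(index_size_gb(30_000_000, 1.5), 1))    # 42.9 (GB)
```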
2017-04-27 12:30 GMT+02:00 alessandro.benedetti :
> +1
> I would add that what is called : "Avg. Document Size (KB)" seems more to
> me
> "Avg. Field Size (KB)".
> Cheers
>
>
>
> Alessandro Benedetti
>
Hi everyone,
I'm evaluating suggesters that can be near real time and I came
across
https://issues.apache.org/jira/browse/LUCENE-5477.
Is there a way to use this functionality from solr?
Thanks very much
Matteo Grolla
Hi,
on Solr 4 the log contained information about the time spent and memory
consumed uninverting a field.
Where can I find this information in current versions of Solr?
Thanks
--excerpt from solr 4.10 log--
INFO - 2018-04-09 15:57:58.720; org.apache.solr.request.UnInvertedField;
UnInverted
Hi everybody,
I'm facing the same problem on Solr 7.3.
Probably requesting a longer session timeout from zk (the default 10s seems
too short) will solve the problem, but I'm puzzled by the fact that this
error is reported by SolrJ as a SolrException with status code 400
(BAD_REQUEST).
in ZkStateReader