specific, have real effect.
Thanks,
roman
On Tue, Jul 30, 2013 at 11:01 PM, Shawn Heisey s...@elyograg.org
wrote:
On 7/30/2013 6:59 PM, Roman Chyla wrote:
I have been wanting some tools for measuring performance of SOLR,
similar
to Mike McCandless' lucene benchmark.
so
self.gen.next()
File "solrjmeter.py", line 229, in changed_dir
os.chdir(new)
OSError: [Errno 20] Not a directory:
'/home/dmitry/projects/lab/solrjmeter/queries/demo/demo.queries'
Best,
Dmitry
On Wed, Jul 31, 2013 at 7:21 PM, Roman Chyla roman.ch...@gmail.com
wrote:
Hi Dmitry,
probably
/2013 6:59 PM, Roman Chyla wrote:
I have been wanting some tools for measuring performance of SOLR,
similar
to Mike McCandless' lucene benchmark.
so yet another monitor was born, is described here:
http://29min.wordpress.com/2013/07/31/measuring-solr-query-performance/
I tested
When you set your cache (solrconfig.xml) to size=0, you are not using a
cache, so you can debug more easily
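For illustration, a cache entry in solrconfig.xml disabled this way might look like the following (a sketch based on the stock example configs; exact class and attribute names depend on your Solr version):

```xml
<!-- size="0" effectively disables this cache, so repeated
     benchmark queries are comparable instead of being served
     from memory -->
<filterCache class="solr.FastLRUCache"
             size="0"
             initialSize="0"
             autowarmCount="0"/>
```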
roman
On Thu, Aug 1, 2013 at 1:12 PM, jimtronic jimtro...@gmail.com wrote:
I have a query that runs slow occasionally. I'm having trouble debugging it
because once it's cached, it runs
Hi, here is a short post describing the results of the yesterday run with
added parameters as per Shawn's recommendation, have fun getting confused ;)
http://29min.wordpress.com/2013/08/01/measuring-solr-performance-ii/
roman
On Wed, Jul 31, 2013 at 12:32 PM, Roman Chyla roman.ch...@gmail.com
On Thu, Aug 1, 2013 at 6:11 PM, Shawn Heisey s...@elyograg.org wrote:
On 8/1/2013 2:08 PM, Roman Chyla wrote:
Hi, here is a short post describing the results of the yesterday run with
added parameters as per Shawn's recommendation, have fun getting confused
;)
http://29min.wordpress.com
.items():
Dmitry
On Thu, Aug 1, 2013 at 6:41 PM, Roman Chyla roman.ch...@gmail.com wrote:
Dmitry,
Can you post the entire invocation line?
roman
On Thu, Aug 1, 2013 at 7:46 AM, Dmitry Kan solrexp...@gmail.com wrote:
Hi Roman,
When I try to run with -q
/home/dmitry
/demo.queries, but there is no such path in the fresh
checkout.
Nice to have the -t param.
Dmitry
On Sat, Aug 3, 2013 at 5:01 AM, Roman Chyla roman.ch...@gmail.com wrote:
Hi Dmitry,
Thanks, it was a teething problem, fixed now, please try the fresh
checkout
AND add the following to your
, Dmitry Kan wrote:
Of the three URLs you asked for, only the 3rd one gave a response:
snip
The rest report 404.
On Mon, Aug 5, 2013 at 8:38 PM, Roman Chyla roman.ch...@gmail.com
wrote:
Hi Dmitry,
So I think the admin pages are different on your version of solr, what
do
you
Thanks!
Dmitry
On Wed, Aug 7, 2013 at 6:54 AM, Roman Chyla roman.ch...@gmail.com
wrote:
Hi Dmitry,
I've modified the solrjmeter to retrieve data from under the core (the
-t
parameter) and the rest from the /solr/admin - I could test it only
against
4.0
On Fri, Aug 9, 2013 at 11:29 AM, Mark static.void@gmail.com wrote:
*All* of the terms in the field must be matched by the query, not
vice-versa.
Exactly. This is why I was trying to explain it as a reverse search.
I just realized I describe it as a *large list of known keywords when
On Fri, Aug 9, 2013 at 2:56 PM, Chris Hostetter hossman_luc...@fucit.orgwrote:
: I'll look into this. Thanks for the concrete example as I don't even
: know which classes to start to look at to implement such a feature.
Either roman isn't understanding what you are asking for, or i'm not --
In case it matters: Python 2.7.3, ubuntu, solr 4.3.1.
Thanks,
Dmitry
On Thu, Aug 8, 2013 at 2:22 AM, Roman Chyla roman.ch...@gmail.com wrote:
Hi Dmitry,
The command seems good. Are you sure your shell is not doing something
funny with the params? You could try:
python solrjmeter.py
at 0x7fc6d4040fd0 is not JSON
serializable
Regards,
D.
On Tue, Aug 13, 2013 at 8:10 AM, Roman Chyla roman.ch...@gmail.com
wrote:
Hi Dmitry,
On Mon, Aug 12, 2013 at 9:36 AM, Dmitry Kan solrexp...@gmail.com
wrote:
Hi Roman,
Good point. I managed to run the command with -C
turnarounds,
Dmitry
On Wed, Aug 14, 2013 at 1:32 AM, Roman Chyla roman.ch...@gmail.com
wrote:
Hi Dmitry, oh yes, late night fixes... :) The latest commit should make
it
work for you.
Thanks!
roman
On Tue, Aug 13, 2013 at 3:37 AM, Dmitry Kan solrexp...@gmail.com
wrote:
Hi
=/admin/cores (which suggests that this is the right value
to
be
used for cores), and not with adminPath=/admin.
Bottom line, this core configuration is not self-evident.
Dmitry
On Fri, Aug 23, 2013 at 4:18 AM, Roman Chyla roman.ch...@gmail.com
wrote
)
at kg.apc.cmd.UniversalRunner.&lt;clinit&gt;(UniversalRunner.java:55)
at
kg.apc.cmd.UniversalRunner.buildUpdatedClassPath(UniversalRunner.java:109)
at kg.apc.cmd.UniversalRunner.&lt;clinit&gt;(UniversalRunner.java:55)
On Tue, Sep 3, 2013 at 2:50 AM, Roman Chyla roman.ch...@gmail.com wrote:
Hi Dmitry
You don't need to index fields several times, you can index it just into
one field, and use different query analyzers just to build the query.
We're doing this for authors, for example - if query language says
=author:einstein, the query parser knows this field should be analyzed
differently
online at:
http://www.cfa.harvard.edu/hr/postings/13-32.html
Thank you,
Roman
--
Dr. Roman Chyla
ADS, Harvard-Smithsonian Center for Astrophysics
roman.ch...@gmail.com
David,
We have a similar query in astrophysics: a user can select an area of the
sky (many stars out there)
I am long overdue in creating a Jira issue, but here you have another
efficient mechanism for searching a large number of ids
i just tested whether our 'beautiful' parser supports it, and funnily
enough, it does :-)
https://github.com/romanchyla/montysolr/commit/f88577345c6d3a2dbefc0161f6bb07a549bc6b15
but i've (kinda) given up hope that people need powerful query parsers in
the lucene world, the LUCENE-5014 is there
Hi Parvesh,
I think you should check the following jira
https://issues.apache.org/jira/browse/SOLR-5379. You will find there links
to other possible solutions/problems :-)
Roman
On 28 Oct 2013 09:06, Erick Erickson erickerick...@gmail.com wrote:
Consider setting expand=true at index time. That
Hi Antoine,
I'll permit myself to respond in English, cause my written French is
slower;-)
Your problem is well known amongst Solr users: the query parser splits
tokens by empty space, so the analyser never sees the input 'la redoutte' but
receives 'la' 'redoutte'. You can of course enclose your
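The splitting behaviour described above can be shown with a toy sketch (plain Python; the synonym entry is hypothetical and the real Lucene query parser is of course more involved):

```python
# Toy illustration: a synonym map keyed on a multi-token entry never
# fires when the "query parser" splits on whitespace before analysis.
SYNONYMS = {"la redoutte": "laredoutte"}  # hypothetical entry

def analyze(token):
    # single-token analyzer: can only look up one token at a time
    return SYNONYMS.get(token, token)

def naive_parse(query):
    # whitespace split happens BEFORE analysis, so the analyzer
    # sees 'la' and 'redoutte' separately, never 'la redoutte'
    return [analyze(t) for t in query.split()]

def phrase_parse(query):
    # quoting keeps the tokens together, so the multi-token
    # synonym entry gets a chance to match
    return [analyze(query)]

print(naive_parse("la redoutte"))   # ['la', 'redoutte'] - synonym missed
print(phrase_parse("la redoutte"))  # ['laredoutte']     - synonym applied
```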
Hello,
We have two solr searchers/instances (read-only). They read the same index,
but they did not return the same #hits for a particular query
Log is below, but to summarize: first server always returns 576 hits, the
second server returns: 440, 440, 576, 576...
These are just a few seconds
/Appinions | g+:
plus.google.com/appinions
https://plus.google.com/u/0/b/112002776285509593336/112002776285509593336/posts
w: appinions.com http://www.appinions.com/
On Wed, Nov 6, 2013 at 4:23 PM, Roman Chyla roman.ch...@gmail.com wrote:
Hello,
We have two solr searchers/instances (read
On Wed, Nov 6, 2013 at 6:40 PM, Roman Chyla roman.ch...@gmail.com wrote:
No, and I should add that this query was not against
Hi,
docids are 'ephemeral', but i'd still like to build a search cache with
them (they allow for the fastest joins).
i'm seeing docids keep changing with updates (especially, in the last index
segment) - as per
https://issues.apache.org/jira/browse/LUCENE-2897
That would be fine, because i could
with openSearcher=false don't open new searchers, which
is why changes aren't visible until a softCommit or a hard commit with
openSearcher=true despite the fact that the segments are closed.
FWIW,
Erick
Best
Erick
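Erick's point about openSearcher can be sketched as a solrconfig.xml fragment (the interval is an arbitrary example value):

```xml
<!-- hard commits flush segments durably, but with openSearcher=false
     they do NOT open a new searcher - changes stay invisible until a
     softCommit or a hard commit with openSearcher=true -->
<autoCommit>
  <maxTime>15000</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>
```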
On Sat, Nov 23, 2013 at 12:40 AM, Roman Chyla roman.ch...@gmail.com
wrote:
Hi
of
operations,
which is something I'm not all that familiar with so I'll leave
explanations
to others.
Thank you, it is useful to get insights from various sides,
roman
On Sat, Nov 23, 2013 at 8:22 PM, Roman Chyla roman.ch...@gmail.com
wrote:
Hi Erick,
Many thanks for the info
,
which is something I'm not all that familiar with so I'll leave
explanations
to others.
On Sat, Nov 23, 2013 at 8:22 PM, Roman Chyla roman.ch...@gmail.com
wrote:
Hi Erick,
Many thanks for the info. An additional question:
Do i understand you correctly that when two segments get merged
a state (of previous index) - as they can be shared by threads that
build the cache
Best,
roman
On Sat, Nov 23, 2013 at 9:40 AM, Roman Chyla roman.ch...@gmail.com
wrote:
Hi,
docids are 'ephemeral', but i'd still like to build a search cache with
them (they allow for the fastest joins
:54 PM, Roman Chyla roman.ch...@gmail.com wrote:
On Mon, Nov 25, 2013 at 12:54 AM, Mikhail Khludnev
mkhlud...@griddynamics.com wrote:
Roman,
I don't fully understand your question. After a segment is flushed it's
never
changed, hence segment-local docids are always the same. Due to merge
Hi,
I'd like to check - there is something I don't understand about cache - and
I don't know if it is a bug, or feature
the following calls return a cache
FieldCache.DEFAULT.getTerms(reader, idField);
FieldCache.DEFAULT.getInts(reader, idField, false);
the resulting arrays *will* contain
expected. Segments are write-once. It's been
a long standing design that deleted data will be
reclaimed on segment merge, but not before. It's
pretty expensive to change the terms loaded on the
fly to respect deleted documents' removed data.
Best,
Erick
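The write-once behaviour Erick describes can be modelled in a few lines (a toy sketch, not the Lucene API: the cache is a flat per-docid array, deletion only flips a liveDocs bit):

```python
# Toy model: a "field cache" is a flat array indexed by docid.
# Deleting a document only flips a liveDocs bit; the cached value
# stays in the array until the segment is merged away.
values = [10, 20, 30, 40]        # cached per-doc values
live = [True, True, True, True]  # liveDocs bitset

def delete(docid):
    live[docid] = False          # value in `values` is untouched

delete(2)
print(values[2])                 # 30 - deleted doc's value still present
visible = [v for v, alive in zip(values, live) if alive]
print(visible)                   # [10, 20, 40]
```

Code that walks the cached arrays therefore has to consult liveDocs itself if it wants to skip deleted documents.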
On Wed, Nov 27, 2013 at 4:07 PM, Roman
Isaac, is there an easy way to recognize this problem? We also index
synonym tokens in the same position (like you do, and I'm sure that our
positions are set correctly). I could test whether the default similarity
factory in solrconfig.xml had any effect (before/after reindexing).
--roman
On
I would be curious what the cause is. Samarth says that it worked for over
a year /and supposedly docs were being added all the time/. Did the index
grow considerably in the last period? Perhaps he could attach visualvm
while it is in the 'black hole' state to see what is actually going on. I
objects
with holding to some big object etc/. Btw if i study the graph, i see that
there *are* warning signs. That's the point of testing/measuring after all,
IMHO.
--roman
On 8 Feb 2014 13:51, Shawn Heisey s...@elyograg.org wrote:
On 2/8/2014 11:02 AM, Roman Chyla wrote:
I would be curious what
And perhaps one other, but very pertinent, recommendation is: allocate only
as little heap as is necessary. By allocating more, you are working against
the OS caching. To know how much is enough is a bit tricky, though.
Best,
roman
On Wed, Feb 12, 2014 at 2:56 PM, Shawn Heisey
Hi Rajeev,
You can take this:
http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201307.mbox/%3CCAEN8dyX_Am_v4f=5614eu35fnhb5h7dzkmkzdfwvrrm1xpq...@mail.gmail.com%3E
I haven't created the jira yet, but I have improved the plugin. Recently, I
have seen a use case of passing 90K identifiers
Hi Tri,
Look at this:
http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201307.mbox/%3CCAEN8dyX_Am_v4f=5614eu35fnhb5h7dzkmkzdfwvrrm1xpq...@mail.gmail.com%3E
Roman
On 13 Feb 2014 03:39, Tri Cao tm...@me.com wrote:
Hi Joel,
Thanks a lot for the suggestion.
After thinking more about
perhaps useful, here is an open source implementation with near[digit]
support, incl. analysis of proximity tokens. When days become longer maybe
it will be packaged into a nice lib... :-)
https://github.com/romanchyla/montysolr/blob/master/contrib/adsabs/grammars/ADS.g
On 25 Mar 2014 00:14, Salman
Hi, What will replace spans, if spans are nuked ?
Roman
On 17 May 2014 09:15, Ahmet Arslan iori...@yahoo.com wrote:
Hi,
Payloads are used to store arbitrary data along with terms. You can
influence score with these arbitrary data.
See :
+1, additionally (as it follows from your observation) the query can get
out of sync with the index, if e.g. it was saved for later use and run
against a newly opened searcher
Roman
On 4 Dec 2014 10:51, Darin Amos dari...@gmail.com wrote:
Hello All,
I have been doing a lot of research in building
keys, hence it excludes such leakage across different
searchers.
On Fri, Dec 5, 2014 at 6:43 AM, Roman Chyla roman.ch...@gmail.com wrote:
+1, additionally (as it follows from your observation) the query can get
out of sync with the index, if eg it was saved for later use and ran
against newly
?
I might not have followed you; this discussion challenges my understanding
of Lucene and SOLR.
Darin
On Dec 5, 2014, at 12:47 PM, Roman Chyla roman.ch...@gmail.com wrote:
Hi Mikhail, I think you are right, it won't be problem for SOLR, but it
is
likely an antipattern inside
Hi Leonid,
I didn't look into solr qparser for a long time, but I think you should be
able to combine different query parsers in one query. Look at the
SolrQueryParser code, maybe now you can specify custom query parser for
every clause (?), something like:
foo AND {!lucene}bar
I don't know, but worth
I think this makes sense too (i.e. the setup), since the search is getting 1K
documents each time (for textual analysis, i.e. they are probably large
docs) and uses Solr as storage (which is totally fine), so the parallel
multiple-drive i/o shards speed things up. The index is probably large, so
Hi everybody,
There exists a new open-source implementation of a search interface for
SOLR. It is written in Javascript (using Backbone), currently in version
v1.0.19 - but new features are constantly coming. Rather than describing it
in words, please see it in action for yourself at
,
Roman
On 30 Jan 2015 21:51, Shawn Heisey apa...@elyograg.org wrote:
On 1/30/2015 1:07 PM, Roman Chyla wrote:
There exists a new open-source implementation of a search interface for
SOLR. It is written in Javascript (using Backbone), currently in version
v1.0.19 - but new features are constantly
),
but that was one year ago...
On Tue, Jan 6, 2015 at 5:20 PM, Vishal Swaroop vishal@gmail.com wrote:
Thanks Roman... I will check it... Maybe it's off topic but how about
Angular...
On Jan 6, 2015 5:17 PM, Roman Chyla roman.ch...@gmail.com wrote:
Hi Vishal, Alexandre,
Here is another one
Hi Vishal, Alexandre,
Here is another one, using Backbone, just released v1.0.16
https://github.com/adsabs/bumblebee
you can see it in action: http://ui.adslabs.org/
While it primarily serves our own needs, I tried to architect it to be
extendible (within reasonable limits of code, man power)
I'm not sure I understand - the autophrasing filter will allow the
parser to see all the tokens, so that they can be parsed (and
multi-token synonyms) identified. So if you are using the same
analyzer at query and index time, they should be able to see the same
stuff.
are you using multi-token
should have retrieved it; but it doesn't.
What could I be doing wrong?
On Wed, Apr 29, 2015 at 2:10 AM, Roman Chyla roman.ch...@gmail.com
wrote:
I'm not sure I understand - the autophrasing filter will allow the
parser to see all the tokens, so that they can be parsed (and
multi-token
,
start: 0,
docs: []
},
debug: {
rawquerystring: "tween 20",
querystring: "tween 20",
parsedquery: "name:tweenx20",
parsedquery_toString: "name:tweenx20",
explain: {},
Thank you,
Kaushik
On Wed, Apr 29, 2015 at 4:00 PM, Roman Chyla roman.ch...@gmail.com
wrote
It shouldn't matter. Btw try a url instead of a file path. I think the
underlying loading mechanism uses Java File, so it could work.
On May 4, 2015 2:07 AM, Zheng Lin Edwin Yeo edwinye...@gmail.com wrote:
Would like to check, will this method of splitting the synonyms into
multiple files use up
, Roman Chyla roman.ch...@gmail.com
wrote:
Hi Kaushik, I meant to compare tween 20 against tween 20.
Your autophrase filter replaces whitespace with x, but your synonym
filter
expects whitespaces. Try that.
Roman
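The mismatch Roman points out can be reduced to a toy sketch (plain Python; the synonym entry is hypothetical):

```python
# Toy sketch: the autophrasing step rewrites "tween 20" to "tweenx20",
# but a synonym file keyed on whitespace-separated entries can never
# match the rewritten token.
SYNONYMS = {"tween 20": "polysorbate 20"}  # hypothetical entry

def autophrase(text):
    return text.replace(" ", "x")

token = autophrase("tween 20")
print(token)                     # tweenx20
print(token in SYNONYMS)         # False - keys still contain spaces
print("tween 20" in SYNONYMS)    # True  - matches only before rewriting
```

Either the synonym keys have to use the same `x` convention, or the synonym filter has to run before the autophrasing step.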
On Apr 29, 2015 2:27 PM, Kaushik kaushika...@gmail.com wrote:
Hi
Hi,
inStockSkusBitSet.get(currentChildDocNumber)
Is that child a lucene id? If yes, does it include offset? Every index
segment starts at a different point, but docs are numbered from zero. So to
check them against the full index bitset, I'd be doing
Bitset.exists(indexBase + docid)
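The docBase arithmetic above can be sketched as follows (a toy model of segment-local vs index-wide docids, not the Lucene API; segment sizes are made up):

```python
# Each segment numbers its documents from 0, so an index-wide bitset
# must be addressed with docBase + local docid.
segments = [
    {"docBase": 0, "numDocs": 3},   # global ids 0..2
    {"docBase": 3, "numDocs": 2},   # global ids 3..4
]
in_stock = {1, 3}  # index-wide set of matching global docids

def is_in_stock(segment, local_docid):
    return (segment["docBase"] + local_docid) in in_stock

print(is_in_stock(segments[1], 0))  # True:  local 0 -> global 3
print(is_in_stock(segments[1], 1))  # False: local 1 -> global 4
```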
Just one
I've taken the route of extending solr, the repo checks out solr and builds
on top of that. The hard part was to figure out how to use solr test
classes and the default location for integration tests, but once there, it
is relatively easy. Google for montysolr, the repo is on github.
Roman
On Oct
Or you could also apply XSL to returned records:
https://wiki.apache.org/solr/XsltResponseWriter
On Thu, Oct 8, 2015 at 5:06 PM, Uwe Reh wrote:
> Hi,
>
> my suggestions are probably too simple, because they are not a real
> protection of privacy. But maybe one fits
I'd like to offer another option:
you say you want to match long query into a document - but maybe you
won't know whether to pick "Mad Max" or "Max is" (not mentioning the
performance hit of "*mad max*" search - or is it not the case
anymore?). Take a look at the NGram tokenizer (say size of 2;
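A minimal character-bigram sketch of the idea (plain Python, hypothetical data; the real NGramTokenizer does this at analysis time):

```python
# Instead of a leading-wildcard search, break both document and query
# into overlapping character bigrams; a candidate matches when enough
# query grams appear in the document's gram set.
def bigrams(text):
    t = text.lower()
    return {t[i:i + 2] for i in range(len(t) - 1)}

doc = bigrams("mad max fury road")
query = bigrams("mad max")
overlap = len(query & doc) / len(query)
print(overlap)  # 1.0 - every query bigram occurs in the document
```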
Hi,
I'm hoping someone has seen/encountered a similar problem. We have
solr instances with all Jetty threads in BLOCKED state. The
application does not respond to any http requests.
It is SOLR 4.9 running inside docker on Amazon EC2. Jetty is 8.1 and
there is an nginx proxy in front of it (with
are available.
--roman
On Tue, Aug 16, 2016 at 9:54 PM, Joel Bernstein <joels...@gmail.com> wrote:
> You'll want to use org.apache.lucene.index.DocValues. The DocValues api has
> replaced the field cache.
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
I need to read data from the index in order to build a special cache.
Previously, in SOLR4, this was accomplished with FieldCache or
DocTermOrds
Now, I'm struggling to see which API to use; there are many of them:
on lucene level:
UninvertingReader.getNumericDocValues (and others)
}
transformer.process(docBase, i);
i++;
}
}
}
}
On Wed, Aug 17, 2016 at 1:22 PM, Roman Chyla <roman.ch...@gmail.com> wrote:
> Joel, thanks, but which of them? I've counted at least 4, if not more,
> different ways of how to get DocValues. Are there many functi
Hello,
We have a use case of a very large index (slave-master; for unrelated
reasons the search cannot work in the cloud mode) - one of the fields is a
very large text, stored mostly for highlighting. To cut down the index size
(for purposes of replication/scaling) I thought I could try to save
Elasticsearch Consulting Support Training - http://sematext.com/
>
> > On 20 Feb 2018, at 20:39, Roman Chyla <roman.ch...@gmail.com> wrote:
> >
> > Say there is a high load and I'd like to bring a new machine and let it
> > replicate the index, if 10
east.
>
> On Tue, Feb 20, 2018 at 10:27 AM, Roman Chyla <roman.ch...@gmail.com>
> wrote:
>
> > Hello,
> >
> > We have a use case of a very large index (slave-master; for unrelated
> > reasons the search cannot work in the cloud mode) - one of the fields is