Re: ltr (reranking) in combination with cursorMarks

2020-08-30 Thread Dmitry Kan
> freiheit.com technologies gmbh > Budapester Straße 45 > 20359 Hamburg / Germany > fon: +49 40 / 890584-0 > Hamburg HRB 70814 > > +++ Hamburg/ Germany + Lisbon/ Portugal +++ > > https://www.freiheit.com > https://www.facebook.com/freiheitcom > > B444 034F 9C95 A569 C5DA 087C E6B9 CCF9 5572 A904 > Managing directors: Claudia Dietze, Stefan Richter > -- Dmitry Kan Luke Toolbox: http://github.com/DmitryKey/luke Blog: http://dmitrykan.blogspot.com Twitter: http://twitter.com/dmitrykan SemanticAnalyzer: https://semanticanalyzer.info

Re: Creating a phrase match feature in LTR

2020-08-28 Thread Dmitry Kan
the exception > > Exception from createWeight for SolrFeature [name=phraseMatch, > params={q={!complexphrase inOrder=true}query(fieldName:${input})}] null > > But similar query works when used in the query reranking construct with > these params > > rqq: "{!complexphrase inOrder=true v=$v1}", > v1: "query(fieldName:"some text"~2^1.0,0)", > > What is the problem in the LTR configuration for the feature ? > -- Dmitry Kan Luke Toolbox: http://github.com/DmitryKey/luke Blog: http://dmitrykan.blogspot.com Twitter: http://twitter.com/dmitrykan SemanticAnalyzer: https://semanticanalyzer.info

Re: Ranking issue when combining sorting and re-ranking on SolrCloud (multiple shards)

2020-08-28 Thread Dmitry Kan
": 13.956842, > > "score": 0.17357588 > > }, > > { "id": "6512", > > "$sort_score": 14.43907, > > "score": 0.11575622 > > }, > > > > We also tried with other simple re-rank queries apart from LTR, and the > > issue persisted. > > > > Could someone please help troubleshoot? Ideally, we would want to have > the > > re-rank results merged on the single node, and not re-apply sorting. > > > > Thank you! > > > -- Dmitry Kan Luke Toolbox: http://github.com/DmitryKey/luke Blog: http://dmitrykan.blogspot.com Twitter: http://twitter.com/dmitrykan SemanticAnalyzer: https://semanticanalyzer.info

Re: Ranking issue when combining sorting and re-ranking on SolrCloud (multiple shards)

2020-08-28 Thread Dmitry Kan
> > > "$sort_score": 14.612957, > > > "score": 0.19214153 > > > }, > > > { "id": "1523", > > > "$sort_score": 14.4093275, > > > "score": 0.26738763 > > > }, > > > {

Re: Rerank for distributed requests

2020-08-28 Thread Dmitry Kan
avior. > > I'm curious if current behavior is intended or not, typically I would > expect either something I described above or at least ignoring sort during > the merge and using only doc.score that was generated by LTR rescorer. > Maybe the community would be interested in the ap

Re: Ranking issue when combining sorting and re-ranking on SolrCloud (multiple shards)

2020-08-28 Thread Dmitry Kan
, > { "id": "6704", > "$sort_score": 13.956842, > "score": 0.17357588 > }, > { "id": "6512", > "$sort_score": 14.43907, > "score": 0.11575622 > }, > > We also tried with other simple re

Re: Issues deploying LTR into SolrCloud

2020-08-26 Thread Dmitry Kan
deployment status per collection in the admin UI? Thanks, Dmitry On Tue, Aug 25, 2020 at 6:20 PM Dmitry Kan wrote: > Hi, > > There is a recent thread "Replication of Solr Model and feature store" on > deploying LTR feature store and model into a master/slave Solr t

Issues deploying LTR into SolrCloud

2020-08-25 Thread Dmitry Kan
SolrCloud? Is there any workaround I can try, like saving the feature store and model JSON files into the collection config path and creating the SolrCloud from there? Thanks, Dmitry -- Dmitry Kan Luke Toolbox: http://github.com/DmitryKey/luke Blog: http://dmitrykan.blogspot.com and https://

Re: Questions about corrupted Segments files.

2019-11-06 Thread Dmitry Kan
xpected extra argument '-fix' > > > > If anybody knows about either a way to fix corrupted segment files or a > way to use checkIndex '-fix' option correctly, could you please let me > know? > > Any clue will be very appreciated. > > Sincerely, >

question on MLT params

2019-05-20 Thread Dmitry Kan
org/solr/TermVector> support." Will the tokens be parsed in the order of appearance in the stored field (same as raw input) or some prioritization like TF*IDF is going to be applied? Thanks, Dmitry -- Dmitry Kan Luke Toolbox: http://github.com/DmitryKey/luke Blog: http://dmitrykan.blogspot.c

Re: tf function query

2017-10-12 Thread Dmitry Kan
> Erick > > On Thu, Oct 5, 2017 at 3:14 AM, Dmitry Kan <solrexp...@gmail.com> wrote: > > Hi, > > > > According to > > https://lucene.apache.org/solr/guide/6_6/function- > queries.html#FunctionQueries-AvailableFunctions > > > > tf(field, term) req

tf function query

2017-10-05 Thread Dmitry Kan
don't use edismax parser to apply multifield boosts, but instead use a custom ranking function. Would appreciate any thoughts, Dmitry -- Dmitry Kan Luke Toolbox: http://github.com/DmitryKey/luke Blog: http://dmitrykan.blogspot.com Twitter: http://twitter.com/dmitrykan SemanticAnalyzer: https

ClassCastException in RelevanceComparator

2017-09-19 Thread Dmitry Kan
) at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:532) at java.lang.Thread.run(Thread.java:745) Would tint fields be causing this? If so, should they be defined as Floats? Thanks, Dmitry -- Dmitry Kan Luke Toolbox: http://github.com/DmitryKey/luke Blog: http

null's in logging

2017-04-07 Thread Dmitry Kan
org.apache.solr.update.DirectUpdateHandler2 *null* - Reordered DBQs detected. Is this a known issue to have *null* or a misconfig on our part? Thanks, Dmitry -- Dmitry Kan Luke Toolbox: http://github.com/DmitryKey/luke Blog: http://dmitrykan.blogspot.com Twitter: http://twitter.com/dmitrykan

sort by function with cursor based result fetching

2017-03-05 Thread Dmitry Kan
in solr 6.x? Thanks! -- Dmitry Kan Luke Toolbox: http://github.com/DmitryKey/luke Blog: http://dmitrykan.blogspot.com Twitter: http://twitter.com/dmitrykan Insider Solutions: https://semanticanalyzer.info

[ANNOUNCEMENT] Luke 6.4.1 released

2017-02-12 Thread Dmitry Kan
Download the release zip here: https://github.com/DmitryKey/luke/releases/tag/luke-6.4.1 Upgrade to Lucene 6.4.1. Supports: Apache Solr 6.4.1 Elasticsearch 5.2.0 Pull-requests: #79 <https://github.com/DmitryKey/luke/pull/79> and #80 <https://github.com/DmitryKey/luke/pull/80>. --

Re: [Result Query Solr] How to retrieve the content of pdfs

2016-09-20 Thread Dmitry Kan
Hi Alexandre, Could you add fl=* to your query and check the output? Alternatively, have a look at your schema file and check what could look like content field: text or similar. Dmitry On Sep 14, 2016, 1:27 AM, "Alexandre Martins" < alexandremart...@gmail.com> wrote: > Hi Guys,
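The suggestion in this reply (adding fl=* so that every stored field comes back in the response) can be sketched as a small URL-building helper. This is only an illustration: the host, port and core name `mycore` are hypothetical, and `build_select_url` is a made-up helper name, not part of Solr or SolrJ. Only the standard `q`, `fl`, `rows` and `wt` request parameters are assumed.

```python
from urllib.parse import urlencode


def build_select_url(base_url, query="*:*", fields="*", rows=1):
    """Build a Solr /select URL that asks for the given field list.

    base_url is the core URL (hypothetical here); fields="*" requests
    all stored fields, which helps to spot the field holding the content.
    """
    params = {"q": query, "fl": fields, "rows": rows, "wt": "json"}
    # urlencode percent-escapes characters such as '*' and ':'
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical local core; adjust to the actual Solr installation.
url = build_select_url("http://localhost:8983/solr/mycore")
print(url)
```

Opening the printed URL in a browser, or passing it to curl, would show all stored fields of the first matching document, assuming a core is reachable at that address.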

RE: Where is Stored values resides ?

2016-07-23 Thread Dmitry Kan
Hi, To my best knowledge the getopt luke is not supported anymore. Use this instead: https://github.com/DmitryKey/luke Regards, Dmitry Hi Prabaharan, You can use Luke to open an index. http://www.getopt.org/luke/ -Original Message- From: Rajendran, Prabaharan

Re: puzzling StemmerOverrideFilterFactory

2016-06-30 Thread Dmitry Kan
e mailing list, so I don't recommend using them. > > If you put an expiration date on whatever you use, make it at least one > month out. > > I see that you mentioned this on IRC as well, EARLY in the morning for > me. I will be sporadically checking there. > > Thanks,

Re: puzzling StemmerOverrideFilterFactory

2016-05-19 Thread Dmitry Kan
> > > > Hello! > > > > > > > > Puzzling case: there is a > > class="solr.StemmerOverrideFilterFactory" > > > > dictionary="stemdict.txt" /> on query side, but not indexing. One > rule is > > > > mapping organization onto organ

Re: puzzling StemmerOverrideFilterFactory

2016-05-19 Thread Dmitry Kan
nowballPorterFilterFactory will stem organization to organ. Still > > searching with organization finds it in the index. Anybody has an idea > why > > this happens? > > > > This is on solr 4.10.2. > > > > Thanks, > > Dmitry > > >

puzzling StemmerOverrideFilterFactory

2016-05-19 Thread Dmitry Kan
? This is on solr 4.10.2. Thanks, Dmitry -- Dmitry Kan Luke Toolbox: http://github.com/DmitryKey/luke Blog: http://dmitrykan.blogspot.com Twitter: http://twitter.com/dmitrykan SemanticAnalyzer: www.semanticanalyzer.info

[ANNOUNCEMENT] Luke 6.0.0 released

2016-04-18 Thread Dmitry Kan
Download the release zip here: https://github.com/DmitryKey/luke/releases/tag/luke-6.0.0 Major upgrade to new Lucene 6.0.0 API. #55 <https://github.com/DmitryKey/luke/pull/55> Enjoy! -- Dmitry Kan Luke Toolbox: http://github.com/DmitryKey/luke Blog: http://dmitrykan.blogspot.com T

[ANNOUNCEMENT] Luke 5.5.0 released

2016-03-19 Thread Dmitry Kan
Download the release zip here: https://github.com/DmitryKey/luke/releases/tag/luke-5.5.0 <https://github.com/DmitryKey/luke/releases/tag/luke-5.4.0> Fixed in this release: #50 <https://github.com/DmitryKey/luke/issues/50> (Literally, the upgrade to Lucene 5.5.0) Enjoy! -- Dmi

Re: [Migration Solr4 to Solr5] Collection reload error

2016-03-10 Thread Dmitry Kan
Thanks Shawn, Missed the openSearcher=false setting. So another thing to check really is whether there are concurrent commitWithin calls ever to the same shard. On Mar 10, 2016, 4:39 PM, "Shawn Heisey" <apa...@elyograg.org> wrote: > On 3/10/2016 3:05 A

Re: [Migration Solr4 to Solr5] Collection reload error

2016-03-10 Thread Dmitry Kan
2 - Incremental : > > - Add or delete documents from the main collection >solrClient.add(doc, 180) // commitWithin > == 30 mn > solrClient.deleteById(doc, 180) // commitWithin == 30 mn > > Maybe you will spot something obviously w

Re: [Migration Solr4 to Solr5] Collection reload error

2016-03-04 Thread Dmitry Kan
> > Gérald and Elodie > > > Kelkoo SAS > Société par Actions Simplifiée > Share capital of € 4.168.964,30 > Registered office: 158 Ter Rue du Temple 75003 Paris > 425 093 069 RCS Paris > > This message and its attachments are confidential and intended > solely for their addressees. If you are not the > intended recipient of this message, please destroy it and notify > the sender. > -- Dmitry Kan Luke Toolbox: http://github.com/DmitryKey/luke Blog: http://dmitrykan.blogspot.com Twitter: http://twitter.com/dmitrykan SemanticAnalyzer: www.semanticanalyzer.info

change default id in results clustering

2016-02-18 Thread Dmitry Kan
Hi, Is it possible to change the id field, that defaults to 'id' in carrot based result clustering? I have another field, 'externalId', that is stamped on each document and would like to return it in clusters instead. -- Dmitry Kan Luke Toolbox: http://github.com/DmitryKey/luke Blog: http

[ANNOUNCE] Luke 5.4.0 released

2016-02-14 Thread Dmitry Kan
earlier, but not announced separately on this list: luke running on Apache Pivot instead of the Thinlet library. It supports lucene 5.2.1. Grab it here: https://github.com/DmitryKey/luke/releases/tag/pivot-luke-5.2.1 Your feedback is appreciated! -- Dmitry Kan Luke Toolbox: http://github.com/DmitryKey

Re: similarity as a parameter

2015-12-15 Thread Dmitry Kan
; You would need to define an alternate field which copied a base field but > then had the desired alternate similarity, using SchemaSimilarityFactory. > > See: > https://cwiki.apache.org/confluence/display/solr/Other+Schema+Elements > > > -- Jack Krupansky > > > On T

Re: similarity as a parameter

2015-12-15 Thread Dmitry Kan
a QParser that wraps the constructed query is going to be the > simplest/cleanest solution regardless of whether #1 or #2 makes the most > sense -- perhaps even achieving #2 by using #1 so that createWeight in > your new QueryWrapper class does the IndexSearcher wrapping before

similarity as a parameter

2015-12-15 Thread Dmitry Kan
Hi guys, Is there a way to alter the similarity class at runtime, with a parameter? -- Dmitry Kan Luke Toolbox: http://github.com/DmitryKey/luke Blog: http://dmitrykan.blogspot.com Twitter: http://twitter.com/dmitrykan SemanticAnalyzer: www.semanticanalyzer.info

ways to affect on SpanMultiTermQueryWrapper.TopTermsSpanBooleanQueryRewrite

2015-11-02 Thread Dmitry Kan
Hi solr fans, Are there ways to affect on strategy behind SpanMultiTermQueryWrapper.TopTermsSpanBooleanQueryRewrite ? As it seems, at the moment, the rewrite method loads max N words that maximize term score. How can this be changed to loading top terms by frequency, for example? -- Dmitry Kan

[ANNOUNCE] Luke 5.3.0 released

2015-09-28 Thread Dmitry Kan
, please file an issue on the luke's github: https://github.com/DmitryKey/luke Luke Team -- Dmitry Kan Luke Toolbox: http://github.com/DmitryKey/luke Blog: http://dmitrykan.blogspot.com Twitter: http://twitter.com/dmitrykan SemanticAnalyzer: www.semanticanalyzer.info

modular QueryParser in contrib

2015-09-21 Thread Dmitry Kan
modularity and customizability. Can you point to what the exact class is? -- Dmitry Kan Luke Toolbox: http://github.com/DmitryKey/luke Blog: http://dmitrykan.blogspot.com Twitter: http://twitter.com/dmitrykan SemanticAnalyzer: www.semanticanalyzer.info

Re: modular QueryParser in contrib

2015-09-21 Thread Dmitry Kan
s ago, but I > haven't > > noticed any real activity or interest in it. > > > > -- Jack Krupansky > > > > On Mon, Sep 21, 2015 at 6:36 AM, Dmitry Kan <solrexp...@gmail.com> > wrote: > > > >> Hello! > >> > >> Asked the q

Re: how to extend JavaBinCodec and make it available in solrj api

2015-08-17 Thread Dmitry Kan
shalinman...@gmail.com wrote: No, I'm afraid you will have to extend the XmlResponseWriter in that case. On Sat, Aug 8, 2015 at 2:02 PM, Dmitry Kan solrexp...@gmail.com wrote: Shalin, Thanks, can I also introduce custom entity tags like in my example with the highlighter output

Re: how to extend JavaBinCodec and make it available in solrj api

2015-08-08 Thread Dmitry Kan
the response writers. Instead, if you just used nested maps/lists or SimpleOrderedMap/NamedList then every response writer should be able to just directly write the output. Nesting is not a problem. On Fri, Aug 7, 2015 at 6:09 PM, Dmitry Kan solrexp...@gmail.com wrote: Shawn: thanks, we found

Re: how to extend JavaBinCodec and make it available in solrj api

2015-08-07 Thread Dmitry Kan
should be able to handle them. On Wed, Aug 5, 2015 at 5:08 PM, Dmitry Kan solrexp...@gmail.com wrote: Hello, Solr: 5.2.1 class: org.apache.solr.common.util.JavaBinCodec I'm working on a custom data structure for the highlighter. The data structure is ready in JSON and XML formats. I need

how to extend JavaBinCodec and make it available in solrj api

2015-08-05 Thread Dmitry Kan
framework such that JavaBinCodec is extended and used for the new data structure? -- Dmitry Kan Luke Toolbox: http://github.com/DmitryKey/luke Blog: http://dmitrykan.blogspot.com Twitter: http://twitter.com/dmitrykan SemanticAnalyzer: www.semanticanalyzer.info

[JOB] Financial search engine company AlphaSense is looking for Search Engineers

2015-08-03 Thread Dmitry Kan
your CV over and let's have a chat. Please e-mail me, if you have any questions. -- Dmitry Kan Luke Toolbox: http://github.com/DmitryKey/luke Blog: http://dmitrykan.blogspot.com Twitter: http://twitter.com/dmitrykan SemanticAnalyzer: www.semanticanalyzer.info

[JOB] Financial search engine company AlphaSense is looking for Search Engineers

2015-07-09 Thread Dmitry Kan
Revolution, ApacheCon, Berlin buzzwords), review books on Solr. Send your CV over and let's have a chat. -- Dmitry Kan Luke Toolbox: http://github.com/DmitryKey/luke Blog: http://dmitrykan.blogspot.com Twitter: http://twitter.com/dmitrykan SemanticAnalyzer: www.semanticanalyzer.info

[ANNOUNCE] Luke 5.2.0 released

2015-07-07 Thread Dmitry Kan
Lucene 5x support #28 https://github.com/DmitryKey/luke/pull/28 Added LUKE_PATH env variable to luke.sh #30 https://github.com/DmitryKey/luke/pull/30 Luke 5.2 -- Dmitry Kan Luke Toolbox: http://github.com/DmitryKey/luke Blog: http://dmitrykan.blogspot.com Twitter: http://twitter.com/dmitrykan

Re: issue with highlighting in solr 4.10.2

2015-06-29 Thread Dmitry Kan
the snippet size you've specified? Shot in the dark, Erick On Fri, Jun 26, 2015 at 3:22 AM, Dmitry Kan solrexp...@gmail.com wrote: Hi, When highlighting hits for the following query: (+Contents:apple +Contents:watch) Contents:iphone I expect the standard solr highlighter

issue with highlighting in solr 4.10.2

2015-06-26 Thread Dmitry Kan
feature? Is there any way to debug the highlighter using solr admin? -- Dmitry Kan Luke Toolbox: http://github.com/DmitryKey/luke Blog: http://dmitrykan.blogspot.com Twitter: http://twitter.com/dmitrykan SemanticAnalyzer: www.semanticanalyzer.info

Re: MappingCharFilterFactory and start and end offsets

2015-06-25 Thread Dmitry Kan
to the *original* text. You can work around this by performing the substitution prior to Solr analysis, e.g. in an update processor like RegexReplaceProcessorFactory. Steve www.lucidworks.com On Jun 18, 2015, at 3:07 AM, Dmitry Kan solrexp...@gmail.com wrote: Hi, It looks like

MappingCharFilterFactory and start and end offsets

2015-06-18 Thread Dmitry Kan
to have start and end offset respecting the remapped token. Can this be achieved with settings? -- Dmitry Kan Luke Toolbox: http://github.com/DmitryKey/luke Blog: http://dmitrykan.blogspot.com Twitter: http://twitter.com/dmitrykan SemanticAnalyzer: www.semanticanalyzer.info

Re: bug in search with sloppy queries

2015-06-15 Thread Dmitry Kan
); } } [/code] as query we get the above structure, from which all terms are extracted without keeping the query structure? Could someone shed light on the logic behind this weight calculation? On Mon, Jun 15, 2015 at 10:23 AM, Dmitry Kan solrexp...@gmail.com wrote: To clarify additionally: we use

Re: bug in search with sloppy queries

2015-06-15 Thread Dmitry Kan
token after analysis to see if my guess is accurate. Best, Erick On Sun, Jun 14, 2015 at 4:34 AM, Dmitry Kan solrexp...@gmail.com wrote: Hi guys, We observe some strange bug in solr 4.10.2, where by a sloppy query hits words it should not: lst name=debugstr name=rawquerystringthe e

Re: bug in search with sloppy queries

2015-06-15 Thread Dmitry Kan
To clarify additionally: we use StandardTokenizer StandardFilter in front of the WDF. Already following ST's transformations e-tail gets split into two consecutive tokens On Mon, Jun 15, 2015 at 10:08 AM, Dmitry Kan solrexp...@gmail.com wrote: Thanks, Erick. Analysis page shows the positions

bug in search with sloppy queries

2015-06-14 Thread Dmitry Kan
in that order. Can somebody shed light into what is going on? -- Dmitry Kan Luke Toolbox: http://github.com/DmitryKey/luke Blog: http://dmitrykan.blogspot.com Twitter: http://twitter.com/dmitrykan SemanticAnalyzer: www.semanticanalyzer.info

storeOffsetsWithPositions does not reflect in the index

2015-05-11 Thread Dmitry Kan
? -- Dmitry Kan Luke Toolbox: http://github.com/DmitryKey/luke Blog: http://dmitrykan.blogspot.com Twitter: http://twitter.com/dmitrykan SemanticAnalyzer: www.semanticanalyzer.info

Re: severe problems with soft and hard commits in a large index

2015-05-06 Thread Dmitry Kan
in context: http://lucene.472066.n3.nabble.com/severe-problems-with-soft-and-hard-commits-in-a-large-index-tp4204068.html Sent from the Solr - User mailing list archive at Nabble.com. -- Dmitry Kan Luke Toolbox: http://github.com/DmitryKey/luke Blog: http://dmitrykan.blogspot.com Twitter

Re: Proximity Search

2015-04-30 Thread Dmitry Kan
delete it from your system immediately and notify us either by e-mail or telephone. You should not copy, forward or otherwise disclose the content of the e-mail. The views expressed in this communication may not necessarily be the view held by WHISHWORKS. -- Dmitry Kan Luke Toolbox

Re: payload similarity

2015-04-25 Thread Dmitry Kan
: http://lucidworks.com/blog/end-to-end-payload-example-in-solr/ Best, Erick On Fri, Apr 24, 2015 at 6:33 AM, Dmitry Kan solrexp...@gmail.com wrote: Ahmet, exactly. As I have just illustrated with code, simultaneously with your reply. Thanks! On Fri, Apr 24, 2015 at 4:30 PM, Ahmet

payload similarity

2015-04-24 Thread Dmitry Kan
Term(body, dogs)); termQuery.setBoost(1.1f); TopDocs topDocs = searcher.search(termQuery, 10); printResults(searcher, termQuery, topDocs); [/code] -- Dmitry Kan Luke Toolbox: http://github.com/DmitryKey/luke Blog: http://dmitrykan.blogspot.com Twitter: http://twitter.com/dmitrykan

Re: payload similarity

2015-04-24 Thread Dmitry Kan
() On Fri, Apr 24, 2015 at 2:50 PM, Dmitry Kan solrexp...@gmail.com wrote: Hi, Using the approach here http://lucidworks.com/blog/getting-started-with-payloads/ I have implemented my own PayloadSimilarity class. When debugging the code I have noticed, that the scorePayload method is never

Re: payload similarity

2015-04-24 Thread Dmitry Kan
Ahmet, exactly. As I have just illustrated with code, simultaneously with your reply. Thanks! On Fri, Apr 24, 2015 at 4:30 PM, Ahmet Arslan iori...@yahoo.com.invalid wrote: Hi Dmitry, I think, it is activated by PayloadTermQuery. Ahmet On Friday, April 24, 2015 2:51 PM, Dmitry Kan

Re: Odp.: phraseFreq vs sloppyFreq

2015-04-22 Thread Dmitry Kan
proximity 1k? @LAFK_PL Original message From: Dmitry Kan Sent: Wednesday, 22 April 2015 09:26 To: solr-user@lucene.apache.org Reply-To: solr-user@lucene.apache.org Subject: phraseFreq vs sloppyFreq Hi guys. I'm executing the following proximity query: leader the~1000. In the debugQuery

phraseFreq vs sloppyFreq

2015-04-22 Thread Dmitry Kan
increase the final similarity score? -- Dmitry Kan Luke Toolbox: http://github.com/DmitryKey/luke Blog: http://dmitrykan.blogspot.com Twitter: http://twitter.com/dmitrykan SemanticAnalyzer: www.semanticanalyzer.info

[ANNOUNCE] Luke 4.10.4 released

2015-03-16 Thread Dmitry Kan
is now distributed as a tar.gz with the luke binary and a launcher script. There is currently luke atop apache pivot cooking in its own branch. You can try it out already for some basic index loading and search operations: https://github.com/DmitryKey/luke/tree/pivot-luke -- Dmitry Kan Luke Toolbox

Re: solr 4.7.2 mergeFactor/ Merge policy issue

2015-03-16 Thread Dmitry Kan
believe that new segments are created when the indexing buffer (ramBufferSizeMB) fills up, even without commits. I'm pretty sure that anytime a new segment is created, the merge policy is checked to see whether a merge is needed. Thanks, Shawn -- Dmitry Kan Luke Toolbox: http

Re: [Poll]: User need for Solr security

2015-03-13 Thread Dmitry Kan
runs etc. Any but trivial encryption will break that, and the trivial encryption is easy to break. So putting all this over an encrypting filesystem is an approach that's often used. FWIW On Thu, Mar 12, 2015 at 5:22 AM, Dmitry Kan solrexp...@gmail.com wrote: Hi, Things you have

Re: [Poll]: User need for Solr security

2015-03-13 Thread Dmitry Kan
. Any but trivial encryption will break that, and the trivial encryption is easy to break. So putting all this over an encrypting filesystem is an approach that's often used. FWIW On Thu, Mar 12, 2015 at 5:22 AM, Dmitry Kan solrexp...@gmail.com wrote: Hi, Things you have

Re: [Poll]: User need for Solr security

2015-03-12 Thread Dmitry Kan
. Examples: Local user management, AD/LDAP integration, SSL, authenticated login to Admin UI, authorization for Admin APIs, e.g. admin user vs read-only user etc -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com -- Dmitry Kan Luke Toolbox: http://github.com/DmitryKey/luke

Re: Missing doc fields

2015-03-12 Thread Dmitry Kan
multiValued=false / field name=ymd type=tdate indexed=true stored=true/ --- - Original mail - From: Dmitry Kan solrexp...@gmail.com To: solr-user@lucene.apache.org Sent: Wednesday, 11 March 2015 11:38:26 Subject: Re

Re: DocumentAnalysisRequestHandler

2015-03-12 Thread Dmitry Kan
? requestHandler name=/analysis/document class=solr.DocumentAnalysisRequestHandler startup=lazy / What is the modern equivalent of Luke? Many thanks. Philippe -- Dmitry Kan Luke Toolbox: http://github.com/DmitryKey/luke Blog: http://dmitrykan.blogspot.com

Re: Missing doc fields

2015-03-11 Thread Dmitry Kan
template, documents only contain three fields (id, _version_, score): SolrDocument{id=3, _version_=1495262517955395584, score=1.0}, How can I increase the number of doc fields? Many thanks. Philipppe -- Dmitry Kan Luke Toolbox: http://github.com/DmitryKey/luke Blog: http

Re: unusually high 4.10.2 vs 4.3.1 RAM consumption

2015-03-10 Thread Dmitry Kan
This freed up couple dozen GBs on the solr server! On Tue, Feb 17, 2015 at 1:47 PM, Dmitry Kan solrexp...@gmail.com wrote: Thanks Toke! Now I consistently see the saw-tooth pattern on two shards with new GC parameters, next I will try your suggestion. The current params are: -Xmx25600m

Re: Conditional invocation of HTMLStripCharFactory

2015-03-02 Thread Dmitry Kan
://lucene.472066.n3.nabble.com/Conditional-invocation-of-HTMLStripCharFactory-tp4190010.html Sent from the Solr - User mailing list archive at Nabble.com. -- Dmitry Kan Luke Toolbox: http://github.com/DmitryKey/luke Blog: http://dmitrykan.blogspot.com Twitter: http://twitter.com/dmitrykan

Re: [ANNOUNCE] Luke 4.10.3 released

2015-03-01 Thread Dmitry Kan
suggested. Thanks, Tomoko 2015-02-26 22:15 GMT+09:00 Dmitry Kan solrexp...@gmail.com: Sure, it is: java version 1.7.0_76 Java(TM) SE Runtime Environment (build 1.7.0_76-b13) Java HotSpot(TM) 64-Bit Server VM (build 24.76-b04, mixed mode) On Thu, Feb 26, 2015 at 2:39 PM, Tomoko Uchida

Re: [ANNOUNCE] Luke 4.10.3 released

2015-02-26 Thread Dmitry Kan
launch. Seems something wrong around Pivot's, but I have no idea about it. Would you tell me java version you're using ? Tomoko 2015-02-26 21:15 GMT+09:00 Dmitry Kan solrexp...@gmail.com: Thanks, Tomoko, it compiles ok! Now launching produces some errors: $ java -cp dist

Re: [ANNOUNCE] Luke 4.10.3 released

2015-02-26 Thread Dmitry Kan
2015-02-26 16:39 GMT+09:00 Dmitry Kan solrexp...@gmail.com: Hi Tomoko, Thanks for the link. Do you have build instructions somewhere? When I executed ant with no params, I get: BUILD FAILED /home/dmitry/projects/svn/luke/build.xml:40: /home/dmitry/projects/svn/luke/lib-ivy does

Re: [ANNOUNCE] Luke 4.10.3 released

2015-02-25 Thread Dmitry Kan
2015-02-25 18:37 GMT+09:00 Dmitry Kan solrexp...@gmail.com: Ok, sure. The plan is to make the pivot branch in the current github repo and update its structure accordingly. Once it is there, I'll let you know. Thank you, Dmitry On Tue, Feb 24, 2015 at 5:26 PM, Tomoko Uchida

Re: [ANNOUNCE] Luke 4.10.3 released

2015-02-25 Thread Dmitry Kan
, Tomoko 2015-02-24 23:34 GMT+09:00 Dmitry Kan solrexp...@gmail.com: Hi, Tomoko! Thanks for being a fan of luke! Current status of github's luke (https://github.com/DmitryKey/luke) is that it has releases for all the major lucene versions since 4.3.0, excluding 4.4.0 (luke 4.5.0 should

Re: highlighting the boolean query

2015-02-25 Thread Dmitry Kan
Architect http://www.lucidworks.com http://www.lucidworks.com/ On Feb 24, 2015, at 3:16 AM, Dmitry Kan solrexp...@gmail.com wrote: Erick, Our default operator is AND. Both queries below parse the same: a OR (b c) OR d a OR (b AND c) OR d The parsed query: str name

Re: highlighting the boolean query

2015-02-24 Thread Dmitry Kan
highlighters are better for this case.. no clue ;( Best, Erick On Mon, Feb 23, 2015 at 9:36 AM, Dmitry Kan solrexp...@gmail.com wrote: Erick, nope, we are using std lucene qparser with some customizations, that do not affect the boolean query parsing logic. Should we try some other

Re: Integration Tests with SOLR 5

2015-02-24 Thread Dmitry Kan
, solr plug-ins etc. for testing in an isolated environment - What does a maven boilerplate code look like? Any ideas would be appreciated. Kind regards, Thomas -- Dmitry Kan Luke Toolbox: http://github.com/DmitryKey/luke Blog: http://dmitrykan.blogspot.com Twitter: http://twitter.com

Re: [ANNOUNCE] Luke 4.10.3 released

2015-02-24 Thread Dmitry Kan
for a bit annoying post. Many thanks, Tomoko 2015-02-24 0:00 GMT+09:00 Dmitry Kan solrexp...@gmail.com: Hello, Luke 4.10.3 has been released. Download it here: https://github.com/DmitryKey/luke/releases/tag/luke-4.10.3 The release has been tested against the solr-4.10.3 based index

highlighting the boolean query

2015-02-23 Thread Dmitry Kan
of the standard highlighter? Can it be mitigated? -- Dmitry Kan Luke Toolbox: http://github.com/DmitryKey/luke Blog: http://dmitrykan.blogspot.com Twitter: http://twitter.com/dmitrykan SemanticAnalyzer: www.semanticanalyzer.info

[ANNOUNCE] Luke 4.10.3 released

2015-02-23 Thread Dmitry Kan
changed from ASL 2.0 to ALv2 Thanks to respective contributors! P.S. waiting for lucene 5.0 artifacts to hit public maven repositories for the next major release of luke. -- Dmitry Kan Luke Toolbox: http://github.com/DmitryKey/luke Blog: http://dmitrykan.blogspot.com Twitter: http://twitter.com

Re: highlighting the boolean query

2015-02-23 Thread Dmitry Kan
AM, Dmitry Kan solrexp...@gmail.com wrote: Hello! In solr 4.3.1 there seem to be some inconsistency with the highlighting of the boolean query: a OR (b c) OR d This returns a proper hit, which shows that only d was included into the document score calculation

Re: Internal document format for Solr 4.10.2

2015-02-18 Thread Dmitry Kan
to store this internal document in xml format ? -- Best Regards, Dinesh Naik -- Dmitry Kan Luke Toolbox: http://github.com/DmitryKey/luke Blog: http://dmitrykan.blogspot.com Twitter: http://twitter.com/dmitrykan SemanticAnalyzer: www.semanticanalyzer.info

Re: unusually high 4.10.2 vs 4.3.1 RAM consumption

2015-02-17 Thread Dmitry Kan
is that the 4.10.2 shard reserves 8x times it uses. What can be done about this? -- Dmitry Kan Luke Toolbox: http://github.com/DmitryKey/luke Blog: http://dmitrykan.blogspot.com Twitter: http://twitter.com/dmitrykan SemanticAnalyzer: www.semanticanalyzer.info

Re: Weird Solr Replication Slave out of sync

2015-02-17 Thread Dmitry Kan
to find especially when there are no errors. What could be happening, and how can I avoid this from happening? Thanks, Summer -- Dmitry Kan Luke Toolbox: http://github.com/DmitryKey/luke Blog: http://dmitrykan.blogspot.com Twitter: http://twitter.com/dmitrykan SemanticAnalyzer

unusually high 4.10.2 vs 4.3.1 RAM consumption

2015-02-17 Thread Dmitry Kan
indexing. What else could be the artifact of such a difference -- Solr or JVM? Can it only be explained by the mass indexing? What is worrisome is that the 4.10.2 shard reserves 8x times it uses. What can be done about this? -- Dmitry Kan Luke Toolbox: http://github.com/DmitryKey/luke Blog: http

Re: unusually high 4.10.2 vs 4.3.1 RAM consumption

2015-02-17 Thread Dmitry Kan
reserves 8x times it uses. What can be done about this? -- Dmitry Kan Luke Toolbox: http://github.com/DmitryKey/luke Blog: http://dmitrykan.blogspot.com Twitter: http://twitter.com/dmitrykan SemanticAnalyzer: www.semanticanalyzer.info -- Dmitry Kan Luke Toolbox: http

Re: unusually high 4.10.2 vs 4.3.1 RAM consumption

2015-02-17 Thread Dmitry Kan
-XX:CMSInitiatingOccupancyFraction=40 Dmitry On Tue, Feb 17, 2015 at 1:34 PM, Toke Eskildsen t...@statsbiblioteket.dk wrote: On Tue, 2015-02-17 at 11:05 +0100, Dmitry Kan wrote: Solr: 4.10.2 (high load, mass indexing) Java: 1.7.0_76 (Oracle) -Xmx25600m Solr: 4.3.1 (normal load, no mass indexing) Java: 1.7.0_11

Re: ApacheCon 2015 at Austin, TX

2015-02-12 Thread Dmitry Kan
if there will be lucene/solr sessions in it. Anyone else planning to attend? Thanks, CP -- Dmitry Kan Luke Toolbox: http://github.com/DmitryKey/luke Blog: http://dmitrykan.blogspot.com Twitter: http://twitter.com/dmitrykan SemanticAnalyzer: www.semanticanalyzer.info

Re: WordDelimiterFilterFactory and position increment.

2015-02-04 Thread Dmitry Kan
WordDelimiterFilter on query side. Regards, Modassar On Fri, Jan 30, 2015 at 5:12 PM, Dmitry Kan solrexp...@gmail.com wrote: Hi, Do you use WordDelimiterFilter on query side as well? On Fri, Jan 30, 2015 at 12:51 PM, Modassar Ather modather1...@gmail.com wrote: Hi

How deletes affect on QPS

2015-01-31 Thread Dmitry Kan
the post is on Lucene level): https://www.elasticsearch.org/blog/lucenes-handling-of-deleted-documents/ -- Dmitry Kan Luke Toolbox: http://github.com/DmitryKey/luke Blog: http://dmitrykan.blogspot.com Twitter: http://twitter.com/dmitrykan SemanticAnalyzer: www.semanticanalyzer.info

Re: solrj returning no results but curl can get them

2015-01-31 Thread Dmitry Kan
-tp4183053p4183119.html Sent from the Solr - User mailing list archive at Nabble.com. -- Dmitry Kan Luke Toolbox: http://github.com/DmitryKey/luke Blog: http://dmitrykan.blogspot.com Twitter: http://twitter.com/dmitrykan SemanticAnalyzer: www.semanticanalyzer.info

Re: solrj returning no results but curl can get them

2015-01-30 Thread Dmitry Kan
-but-curl-can-get-them-tp4183053.html Sent from the Solr - User mailing list archive at Nabble.com. -- Dmitry Kan Luke Toolbox: http://github.com/DmitryKey/luke Blog: http://dmitrykan.blogspot.com Twitter: http://twitter.com/dmitrykan SemanticAnalyzer: www.semanticanalyzer.info

Re: WordDelimiterFilterFactory and position increment.

2015-01-30 Thread Dmitry Kan
queries starting with token which is tokenized as shown above in the table. Kindly help me understand the behavior and let me know how the phrase search is possible in such cases without the slop. Thanks, Modassar -- Dmitry Kan Luke Toolbox: http://github.com/DmitryKey/luke

Re: SOS-help: How to store solr index data in hbase table???

2015-01-26 Thread Dmitry Kan
would do that. You *can* store your indexes in HDFS storage, but that's not the same thing. https://cwiki.apache.org/confluence/display/solr/Running+Solr+on+HDFS I have never done this, so I have no idea whether this documentation is complete. Thanks, Shawn -- Dmitry Kan Luke Toolbox

groups inside groups

2015-01-15 Thread Dmitry Kan
,'docs'=[ { '255'=2}, { '3042'=3}, { '3428'=1}, { '68'=4}] }}} -- Dmitry Kan Luke Toolbox: http://github.com/DmitryKey/luke Blog: http://dmitrykan.blogspot.com Twitter: http://twitter.com/dmitrykan

Re: SegmentInfos exposed to /admin/luke

2014-12-08 Thread Dmitry Kan
-- Dmitry Kan Luke Toolbox: http://github.com/DmitryKey/luke Blog: http://dmitrykan.blogspot.com Twitter: http://twitter.com/dmitrykan SemanticAnalyzer: www.semanticanalyzer.info

dynamically change default update chain

2014-11-03 Thread Dmitry Kan
. no parameter change needed. Is this possible with the current state of the Solr core / collection api or some other method? -- Dmitry Kan Blog: http://dmitrykan.blogspot.com Twitter: http://twitter.com/dmitrykan SemanticAnalyzer: www.semanticanalyzer.info

Re: dynamically change default update chain

2014-11-03 Thread Dmitry Kan
, Dmitry Kan wrote: Hello solr fellows, I'm working on a project that involves using two update chains. One default chain is used most of the time and another one custom is used sporadically. The default update chain is called automatically without action needed (well, that's why it is default

Re: dynamically change default update chain

2014-11-03 Thread Dmitry Kan
An update: Another idea comes from Erik Hatcher; sharing it for the benefit of anyone who's interested in the topic: erikhatcher maybe you can make a custom request handler that toggles which is the default chain? On Mon, Nov 3, 2014 at 4:08 PM, Dmitry Kan solrexp...@gmail.com wrote
