Should the Solr release notes reference the additional fixes that went in
there?
From your email to start the thread:
- SOLR-8496: multi-select faceting and getDocSet(List) can match
deleted docs
- SOLR-8418: Adapt to changes in LUCENE-6590 for use of boosts with
MLTHandler and
I am currently working on migrating a project from an old version of Solr
to Elasticsearch, and came across a funny (to me at least) difference in
the "default" behavior of JapanesePartOfSpeechStopFilterFactory.
If JapanesePartOfSpeechStopFilterFactory is given empty args, it does
nothing. It
't fix it in 8.x releases... not sure).
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Fri, Oct 2, 2020 at 12:10 PM Michael Froh wrote:
>
>> I am currently working on migrating a project from an old version of Solr
>> to Elasticsearch, and cam
+1 (Non-binding)
Upgraded Amazon Product Search to this RC and found no issues.
On Fri, Jul 10, 2020 at 5:03 AM Namgyu Kim wrote:
> +1 SUCCESS! [1:25:53.314724]
>
> On Fri, Jul 10, 2020 at 2:22 PM Tomás Fernández Löbbe <
> tomasflo...@gmail.com> wrote:
>
>> +1
>>
>> SUCCESS! [1:04:02.550893]
ES/Solr
> layer (which I know you don't use, but hypothetically speaking), I'm
> dubious there as well.
> >>
> >> ~ David Smiley
> >> Apache Lucene/Solr Search Developer
> >> http://www.linkedin.com/in/davidwsmiley
> >>
> >>
> >>
My team at work has a neat feature that we've built on top of Lucene that
has provided a substantial (20%+) increase in maximum qps and some
reduction in query latency.
Basically, we run a training process that looks at historical queries to
find frequently co-occurring combinations of required
the index is smaller? Now that we have
> > ConditionalTokenFilter (for branching), can the feature be implemented
> > cleanly?
> >
> > Ideally it wouldn't require a lot of new code, something like checking
> > a "set" + conditionaltokenfilter + shinglefilter?
> >
doing this for
> non-scoring cases maybe something is off?
>
> On Tue, Dec 15, 2020 at 3:19 PM Michael Froh wrote:
> >
> > It's conceptually similar to CommonGrams in the single-field case,
> though it doesn't require terms to appear in any particular positions.
> >
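The training step mentioned in this thread (looking at historical queries to find frequently co-occurring required terms, without regard to term positions) can be sketched roughly as below. This is only an illustration, not the implementation described in the thread: the function name `frequent_term_pairs`, the `min_count` threshold, the restriction to pairs, and the sample query log are all assumptions made for the example.

```python
from collections import Counter
from itertools import combinations


def frequent_term_pairs(queries, min_count=2):
    """Count unordered pairs of required terms that co-occur in one query.

    `queries` is an iterable of term lists (hypothetical input format).
    Positions are ignored: a pair counts if both terms are required by the
    same query, in any order.
    """
    counts = Counter()
    for required_terms in queries:
        # Deduplicate and sort so (a, b) and (b, a) count as the same pair.
        unique_terms = sorted(set(required_terms))
        for pair in combinations(unique_terms, 2):
            counts[pair] += 1
    # Keep only pairs seen at least min_count times, most frequent first.
    return [(pair, n) for pair, n in counts.most_common() if n >= min_count]


# Hypothetical query log: each entry lists the required terms of one query.
historical_queries = [
    ["brand:acme", "color:red"],
    ["color:red", "brand:acme", "size:m"],
    ["brand:acme", "size:m"],
    ["color:blue"],
]
result = frequent_term_pairs(historical_queries)
```

Pairs that pass the threshold would be the candidates for whatever combined representation is chosen at index time; that part is not shown here, since the thread excerpt does not describe it.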
I have some code that is kind of abusing IndexWriter.deleteAll(). In short,
I'm basically experimenting with using tiny (one block of joined
parent/child documents) indexes as a serialized format to index on one
fleet and then merge these tiny indexes on another fleet. I'm doing this by
indexing a
IndexWriter instances.
On Wed, Nov 18, 2020 at 12:25 PM Michael Sokolov wrote:
> I'm curious if you tried creating a new IndexWriter for each batch?
>
> On Wed, Nov 18, 2020 at 1:18 PM Michael Froh wrote:
> >
> > I have some code that is kind of abusing IndexWriter.deleteA
r
> http://www.linkedin.com/in/davidwsmiley
>
>
> On Wed, Nov 18, 2020 at 1:17 PM Michael Froh wrote:
>
>> I have some code that is kind of abusing IndexWriter.deleteAll(). In
>> short, I'm basically experimenting with using tiny (one block of joined
>> parent/chi
We recently added multimodal search in OpenSearch:
https://github.com/opensearch-project/neural-search/pull/359
Since Lucene ultimately just cares about embeddings, does Lucene itself
really need to be multimodal? Wherever the embeddings come from, Lucene can
index the vectors and combine with
Hey,
I've been musing about ideas for a "clever" Boolean field type on Lucene
for a while, and I think I might have an idea that could work. That said,
this popped into my head this afternoon and has not been fully-baked. It
may not be very clever at all.
My experience is that Boolean fields
the posting list goes like
>> dense sequentially increasing numbers 1,2,3,4,5.. May it already be
>> compressed by codecs like
>> https://lucene.apache.org/core/9_2_0/core/org/apache/lucene/util/packed/MonotonicBlockPackedWriter.html
>> ?
>>
>> On Thu, Nov 9, 2023 at 3:31 AM Mic
Hi,
I was recently poking around in the createWeight implementation for
MultiTermQueryConstantScoreWrapper to get to the bottom of some slow
queries, and I realized that the worst-case performance could be pretty
bad, but (maybe) possible to optimize for.
Imagine if we have a segment with N docs
Hi all,
I was looking into a customer issue where they noticed some increased GC
time after upgrading from Lucene 7.x to 9.x. After taking some heap dumps
from both systems, the big difference was tracked down to the float[256]
allocated (as a norms cache) when creating a BM25Scorer (in
paring heap dumps from
production hosts so far, so I'll try measuring in an environment where I
can see what's going on.
On Tue, May 2, 2023 at 1:14 PM Robert Muir wrote:
> On Tue, May 2, 2023 at 3:24 PM Michael Froh wrote:
> >
> > > This seems ok if it isn't invasive. I s
ions in the demoscene.
I could try inlining those calculations and measuring the impact with the
luceneutil benchmarks.
On Tue, May 2, 2023 at 11:34 AM Robert Muir wrote:
> On Tue, May 2, 2023 at 12:49 PM Michael Froh wrote:
> >
> > Hi all,
> >
> > I was looking
Hi there,
I was recently writing up a short Lucene file format tutorial (
https://msfroh.github.io/lucene-university/docs/DirectoryFileContents.html),
using SimpleTextCodec for educational purposes.
I found that SimpleTextSegmentInfo tries to output the segment ID as raw
bytes, which will often
Hi,
On OpenSearch, we've been taking advantage of the various O(1)
Weight#count() implementations to quickly compute various aggregations
without needing to iterate over all the matching documents (at least when
the top-level query is functionally a match-all at the segment level). Of
course,
ration on the bit set?
>
> I don't think we can fold it into Weight#count since there is an
> expectation that it is negligible compared with the cost of a naive count,
> but we may be able to do it in IndexSearcher#count or on the OpenSearch
> side.
>
> On Fri, Feb 2, 20
Is your new test uncommitted?
The Gradle check will fail if you have uncommitted files, to avoid the
situation where it "works on my machine (because of a file that I forgot to
commit)".
The rough workflow is:
1. Develop stuff (code and/or tests).
2. Commit it.
3. Gradle check.
4. If Gradle
Hi Anand,
Interesting that you should bring this up!
There was a talk just this week at Berlin Buzzwords about using
cuVS with Lucene: https://www.youtube.com/watch?v=qiW7iIDFJC0
From that talk, it sounds like the folks at SearchScale have managed to
integrate cuVS as a custom codec
Michael Froh created SOLR-3526:
--
Summary: Remove classfile dependency on ZooKeeper from
CoreContainer
Key: SOLR-3526
URL: https://issues.apache.org/jira/browse/SOLR-3526
Project: Solr
Issue
[
https://issues.apache.org/jira/browse/SOLR-3526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292689#comment-13292689
]
Michael Froh commented on SOLR-3526:
Oh, thanks a lot for pointing that out, Hoss! I
Michael Froh created LUCENE-4185:
Summary: CharFilters being added twice in Solr
Key: LUCENE-4185
URL: https://issues.apache.org/jira/browse/LUCENE-4185
Project: Lucene - Java
Issue Type
[
https://issues.apache.org/jira/browse/LUCENE-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael Froh updated LUCENE-4185:
-
Affects Version/s: (was: 4.0)
4.0-ALPHA
CharFilters being added
Michael Froh created SOLR-5330:
--
Summary: PerSegmentSingleValuedFaceting overwrites facet values
Key: SOLR-5330
URL: https://issues.apache.org/jira/browse/SOLR-5330
Project: Solr
Issue Type
[
https://issues.apache.org/jira/browse/SOLR-5330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael Froh updated SOLR-5330:
---
Attachment: solr-5330.patch
Patch attached
PerSegmentSingleValuedFaceting overwrites facet values
[
https://issues.apache.org/jira/browse/SOLR-3526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15014134#comment-15014134
]
Michael Froh commented on SOLR-3526:
3.5 years later, I decided to try taking a stab at this myself
[
https://issues.apache.org/jira/browse/SOLR-3526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15014134#comment-15014134
]
Michael Froh edited comment on SOLR-3526 at 11/19/15 6:53 PM:
--
3.5 years later
[
https://issues.apache.org/jira/browse/SOLR-3526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15014167#comment-15014167
]
Michael Froh commented on SOLR-3526:
Also worth highlighting -- the significant part of the change