Re: Java 17 and Lucene

2021-10-20 Thread Jigar Shah
Michael, Is this recommended "-XX:+UseZGC options to enable ZGC." as it claims very low pauses. For "*DY* (2021-10-19 08:14:33): Upgrade to JDK17+35" execution for "Indexing throughput " is ZGC used for the "Indexing throughput

Re: MMapDirectory vs In Memory Lucene Index (i.e., ByteBuffersDirectory)

2020-12-14 Thread Jigar Shah
Thanks, Uwe. Yes, recommended: tmpfs/ramfs worked like a charm in our use case with a read-only index, giving us very high throughput and consistent response times on queries. We had to build some redundancy around that service to make it highly available, so we can do a rolling update on the

Re: MMapDirectory vs In Memory Lucene Index (i.e., ByteBuffersDirectory)

2020-12-14 Thread Jigar Shah
I used one of the Linux features (ramfs, basically mounting RAM as a partition) to guarantee that it's always in RAM (no accidental paging ;) cost too). https://www.jamescoyle.net/how-to/943-create-a-ram-disk-in-linux WARN: Only use this if it's a read-only index that can fit in RAM and you have a back-up

Re: Deduplication of search result with custom sort

2020-10-09 Thread Jigar Shah
My learnings from dealing with this problem: we faced a similar problem before and did the following things: 1) Don't request totalGroupCount, if you can live without it; the response was fast, as computing the group count is an expensive task. Although you can approximate pagination up to total
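A minimal sketch of the idea in point 1, using plain Java collections rather than the Lucene grouping API: walk hits that are already in the custom sort order, keep the first hit per group, and stop once the page is full, so no total group count is ever computed. The class name `GroupCollapse`, the method `firstPerGroup`, and the group-key function are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

// Hypothetical helper, not part of Lucene.
final class GroupCollapse {
    // Keeps the first hit of each group from an already-sorted hit list.
    // Stops as soon as the page is full, so the number of groups is never counted.
    static <T, K> List<T> firstPerGroup(List<T> sortedHits, Function<T, K> groupKey, int pageSize) {
        Set<K> seenGroups = new HashSet<>();
        List<T> page = new ArrayList<>();
        for (T hit : sortedHits) {
            if (page.size() >= pageSize) {
                break; // page is full: remaining hits are not inspected
            }
            if (seenGroups.add(groupKey.apply(hit))) {
                page.add(hit); // first hit seen for this group
            }
        }
        return page;
    }
}
```

Calling `firstPerGroup(hits, keyFn, 2)` on hits sorted as `a:1, b:2, a:3, c:4` (grouped by the part before the colon) returns the first two distinct groups and never looks at the rest.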

Re: Lucene one to many query

2019-09-21 Thread Jigar Shah
The nested document structure supported by Solr is what you need. But as you are using Lucene, you should denormalize and store each item with its company fields and price. Apply the search on items with a function query on item_price. Once you have results, you can collect the companies in a set. On Sat, Sep 21, 2019,
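The denormalization described here can be illustrated with an in-memory stand-in (no Lucene calls): each item record carries its company's identifier alongside its own price, matching items are selected by price, and the owning companies are collected into a set. The class and field names (`DenormalizedItems`, `Item`, `companyId`, `itemPrice`) are assumptions made for this sketch; in a real index these would be document fields and the price filter would be a query.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical in-memory model of denormalized item documents.
final class DenormalizedItems {
    static final class Item {
        final String companyId; // company field copied onto every item
        final String itemName;
        final double itemPrice;

        Item(String companyId, String itemName, double itemPrice) {
            this.companyId = companyId;
            this.itemName = itemName;
            this.itemPrice = itemPrice;
        }
    }

    // Selects items whose price lies in [min, max] and returns the distinct owning companies.
    static Set<String> companiesWithItemPricedBetween(List<Item> items, double min, double max) {
        Set<String> companies = new LinkedHashSet<>();
        for (Item item : items) {
            if (item.itemPrice >= min && item.itemPrice <= max) {
                companies.add(item.companyId); // the set removes duplicate companies
            }
        }
        return companies;
    }
}
```

A company with several matching items appears only once in the result, which is the one-to-many collapse the reply describes.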

Re: Nested Facets

2019-08-30 Thread Jigar Shah
You should look at the facet pivots feature that Solr provides, based on doc-values. As you are using core Lucene, you may have to do a little more searching on how to do this at a low level with the Lucene Facet API based on DocValues facets. Your starting point should be

Re: SearcherTaxonomyManager Refreshing

2017-08-24 Thread Jigar Shah
Looks like your approach to managing the main index and the taxonomy index is risky. The main index keeps ordinals of the taxonomy index. If you replace directories, the taxo reader might have ordinals out of sync with the main index. One fact about the taxonomy index is that deletes or cleanup of the main index doesn't affect

Re: Lucene 6.6: "Too many open files"

2017-07-31 Thread Jigar Shah
I faced this problem when I used NoMergePolicy and wrote some code to merge segments manually; that code had a bug, and I had the same issue. Make sure you have the default (AFAIR) ConcurrentMergeScheduler enabled, and that it is configured with appropriate settings. On Jul 31, 2017 11:21 PM, "Erick Erickson"

ProximityQueryNode doesn't allow distance parameter to be 0

2016-10-31 Thread Jigar Shah
In some cases tokens are indexed at the same position, e.g. when using a synonym filter. The flexible query parser API doesn't allow creating a ProximityQueryNode with distance '0'. {code} if (type == Type.NUMBER) { if (distance <= 0) { throw new QueryNodeError(new MessageImpl(
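To see why a distance of 0 can be meaningful, here is a small sketch of how token positions follow from position increments: the position starts at -1 and each token adds its increment, so a token emitted with increment 0 lands on the same position as the previous one. That a synonym is emitted with increment 0 is the assumption behind the example; the class name `TokenPositions` is hypothetical and nothing here calls Lucene.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper illustrating position arithmetic only.
final class TokenPositions {
    // Converts per-token position increments into absolute positions.
    // The running position starts at -1, so a first increment of 1 yields position 0.
    static List<Integer> fromIncrements(int[] positionIncrements) {
        List<Integer> positions = new ArrayList<>();
        int position = -1;
        for (int increment : positionIncrements) {
            position += increment;
            positions.add(position);
        }
        return positions;
    }
}
```

With increments `{1, 0, 1}` for the tokens "quick", "fast" (synonym) and "fox", the first two tokens share a position, so the distance between them is 0.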

Re: NOT Operator with Parenthesis

2015-10-28 Thread Jigar Shah
LUCENE-6249 <https://issues.apache.org/jira/browse/LUCENE-6249> and LUCENE-6857 <https://issues.apache.org/jira/browse/LUCENE-6857> will be back-ported to 4.10.5. You may not need to jump to 5.X version for this. Thanks, Jigar Shah. On Wed, Oct 28, 2015 at 5:19 AM, patel mrugesh

Re: NOT Operator with Parenthesis

2015-10-27 Thread Jigar Shah
Most probably LUCENE-6249 changes the parser's behavior for your case. On Tue, Oct 27, 2015 at 5:33 AM, patel mrugesh wrote: > Thanks for your reply Erick, > I have

Re: Taxonomy index and payload

2015-05-12 Thread Jigar Shah
Check Facet Associations section in this video. https://www.youtube.com/watch?v=-CNZxkAMcKk On Tue, May 12, 2015 at 4:15 AM, Federico Tolomei f...@s17t.net wrote: Hello, is it possible to add a payload within the facet in the taxonomy index ? Thank you -- https://s17t.net f...@s17t.net

Matched docIds for each facet value

2015-05-08 Thread Jigar Shah
Hello, is it possible to get the matched docIds for each facet value? Currently we only get the count. Let me know which Lucene-internal classes I should look at, in case this is not exposed in the API. Thanks, Jigar Shah.

Re: Top 10 words

2015-02-13 Thread Jigar Shah
If those are known fields in the documents, you may extract words while indexing and create facets. Lucene supports faceted search, which can give you Top n counts for such fields and is much more efficient. Another option is to apply a clustering algorithm on the results, which can provide Top n

Re: Proximity query

2015-02-12 Thread Jigar Shah
This concept is called Proximity Search in general. In Lucene it is achieved using SpanQuery. On Thu, Feb 12, 2015 at 10:10 PM, Maisnam Ns maisnam...@gmail.com wrote: Hi, Can someone help me if this use case is possible or not with lucene Use case: I have a string say 'Japan' appearing
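As a rough illustration of what a proximity match means, the sketch below checks whether two different terms occur within a given number of token positions of each other. It is a simplified position-distance check in plain Java, not Lucene's SpanQuery implementation or its exact slop semantics; the class `ProximityCheck` and its method are hypothetical names, and the two terms are assumed to be distinct.

```java
import java.util.List;

// Hypothetical helper; a simplified stand-in for a span-style proximity test.
final class ProximityCheck {
    // Returns true if some occurrence of termA and some occurrence of termB
    // are at most maxDistance token positions apart (in either order).
    static boolean within(List<String> tokens, String termA, String termB, int maxDistance) {
        int lastA = -1; // most recent position of termA, -1 if not seen yet
        int lastB = -1; // most recent position of termB, -1 if not seen yet
        for (int i = 0; i < tokens.size(); i++) {
            String token = tokens.get(i);
            if (token.equals(termA)) lastA = i;
            if (token.equals(termB)) lastB = i;
            if (lastA >= 0 && lastB >= 0 && Math.abs(lastA - lastB) <= maxDistance) {
                return true;
            }
        }
        return false;
    }
}
```

Tracking only the most recent position of each term is enough, because the closest partner for any occurrence found while scanning forward is always the latest earlier occurrence of the other term.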

Re: Faceted Search Hierarchy

2015-01-08 Thread Jigar Shah
for top children, you will get Asia + India, both with a count of 1. Shai On Thu, Jan 8, 2015 at 1:48 PM, Jigar Shah jigaronl...@gmail.com wrote: Very simple question, on facet Index has 2 documents as follows: Doc1 Indexed facet path: Asia/India Doc2 Indexed facet path: India
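The counts quoted above can be reproduced with a small sketch that tallies the first component of each document's facet path, assuming one path per document and "/" as the separator. This is plain Java for illustration only and does not use the taxonomy index; `TopLevelFacetCounts` is a hypothetical name.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper; models top-level facet counting without a taxonomy index.
final class TopLevelFacetCounts {
    // Each document contributes one count to the first component of its facet path.
    static Map<String, Integer> count(List<String> facetPathPerDoc) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String path : facetPathPerDoc) {
            String topLevel = path.split("/")[0];
            counts.merge(topLevel, 1, Integer::sum);
        }
        return counts;
    }
}
```

For the two documents in the question (paths `Asia/India` and `India`), this yields Asia with a count of 1 and India with a count of 1: the top-level "India" is a different category from "Asia/India".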

Re: Faceted Search Hierarchy

2015-01-08 Thread Jigar Shah
/India, we cannot go back to the other document and update the hierarchy. On Thu, Jan 8, 2015 at 3:27 PM, Jigar Shah jigaronl...@gmail.com wrote: Is there some way to achieve this at Lucene level. so i can get facet like below ? Doc1: Asia + Asia/India Doc2: India + Asia/India/Gujarat

Re: Exception from FastTaxonomyFacetCounts

2014-10-15 Thread Jigar Shah
and DirectoryTaxonomyReader. Shai On Mon, Oct 13, 2014 at 12:15 PM, Jigar Shah jigaronl...@gmail.com wrote: In my application I have two instances of SearcherManager. 1) SearcherManager with 'applyAllDeletes = true' which is used by Indexer. (Works in NRT mode, deletes should be visible to it, also i have

lucene-facet-4.10.1 version not changed in 'DirectoryTaxonomyWriter'

2014-10-15 Thread Jigar Shah
Hello Lucene committers, I saw an inconsistent version usage in lucene-facet-4.10.1.jar. lucene-facet-4.10.1.jar uses the deprecated 'Version.LUCENE_4_10_0' in class 'DirectoryTaxonomyWriter', method 'createIndexWriterConfig'. Ignore this if it is deliberate. Thanks,

Re: Exception from FastTaxonomyFacetCounts

2014-10-13 Thread Jigar Shah
AM, Jigar Shah jigaronl...@gmail.com wrote: Intermittently while searching I am getting this exception on a huge index. (The FacetsConfig used while indexing and searching is the same.) java.lang.ArrayIndexOutOfBoundsException: 252554 06:28:37,954 ERROR [stderr

Getting min/max of numeric doc-values facets

2014-10-09 Thread Jigar Shah
Is there some way, when a faceted search is executed, to retrieve the possible min/max values of a numeric doc-values field along with the supplied custom ranges (LongRangeFacetCounts), or some other way to do it? I believe this can give the application a hint, and the next search request can be much smarter,

Exception from FastTaxonomyFacetCounts

2014-10-07 Thread Jigar Shah
] at com.company.search.CustomDrillSideways.buildFacetsResult(LuceneDrillSideways.java:41) 06:28:37,954 ERROR [stderr] at org.apache.lucene.facet.DrillSideways.search(DrillSideways.java:146) 06:28:37,955 ERROR [stderr] at org.apache.lucene.facet.DrillSideways.search(DrillSideways.java:203) Thanks, Jigar Shah

FacetsConfig usage

2014-10-05 Thread Jigar Shah
? Thanks, Jigar Shah

Re: DrillSideways accepting FacetCollector parameter

2014-07-14 Thread Jigar Shah
the buildFacetResult method? That method gets the drill down and all sideways collectors... Mike McCandless http://blog.mikemccandless.com On Wed, Jul 9, 2014 at 1:40 AM, Jigar Shah jigaronl...@gmail.com wrote: Usecase: With below code i perform search. DrillSideways drillSideWays = new

DrillSideways accepting FacetCollector parameter

2014-07-08 Thread Jigar Shah
, i.e. non sideways facets. Thanks, Jigar Shah.

Re: DrillSideways accepting FacetCollector parameter

2014-07-08 Thread Jigar Shah
. This is not true in case of DrillSideways. Let me know if, there is already some other way provided. Thanks, Jigar Shah. On Tue, Jul 8, 2014 at 8:15 PM, Michael McCandless luc...@mikemccandless.com wrote: We could do this, but what's the use case? E.g. DrillSideways also hardwires the drill-sideways

Re: DocIDs from Facet Results

2014-07-07 Thread Jigar Shah
I think you need to execute a DrillDownQuery to get the docIds. On Mon, Jul 7, 2014 at 4:40 PM, Sandeep Khanzode sandeep_khanz...@yahoo.com.invalid wrote: Hi, For Lucene 4.7.2 Facets, once we invoke FacetCollector and get the topNChildren into FacetResult, is there any mechanism that for a

Re: Searching on Large Indexes

2014-06-27 Thread Jigar Shah
Some points based on my experience. You can consider a SolrCloud implementation if you want to distribute your index over multiple servers. Use MMapDirectory locally for each Solr instance in the cluster. Run warm-up queries on server start-up, so most of the documents will be cached; you will start

Re: Lucene Facets Module 4.8.1

2014-06-23 Thread Jigar Shah
'config.setIndexFieldName(CITY, city)' at index time and see if the exception still happens? Mike McCandless http://blog.mikemccandless.com On Sat, Jun 21, 2014 at 1:08 AM, Jigar Shah jigaronl...@gmail.com wrote: Thanks for helping me. Yes, i did couple of things: Below is simple code

Re: Lucene Facets Module 4.8.1

2014-06-23 Thread Jigar Shah
'config.setIndexFieldName(CITY, city)' at index time and see if the exception still happens? Mike McCandless http://blog.mikemccandless.com On Sat, Jun 21, 2014 at 1:08 AM, Jigar Shah jigaronl...@gmail.com wrote: Thanks for helping me. Yes, i did couple of things

Re: Lucene Facets Module 4.8.1

2014-06-23 Thread Jigar Shah
. There's no way for the default ctor of FastTaxonomyFacetCounts to determine which indexFieldName to use as it doesn't know which dimensions you're going to ask to count. Hope that helps. Shai On Sun, Jun 22, 2014 at 4:05 PM, Jigar Shah jigaronl...@gmail.com wrote: I will try to dig more

Re: Lucene Facets Module 4.8.1

2014-06-23 Thread Jigar Shah
that... Shai On Mon, Jun 23, 2014 at 9:04 AM, Jigar Shah jigaronl...@gmail.com wrote: On commenting //config.setIndexFieldName(CITY, city); at search time, this is before i do, getTopChildren(...) I get following exception. Caused by: java.lang.ArrayIndexOutOfBoundsException

Re: Lucene Facets Module 4.8.1

2014-06-23 Thread Jigar Shah
, or the majority of them, that's ok. But if you know you *always* need the count of a subset of them, then separating that subset to a different field is better. Hope that clarifies. Shai On Mon, Jun 23, 2014 at 4:18 PM, Jigar Shah jigaronl...@gmail.com wrote: Thanks this worked for me

Re: Lucene Facets Module 4.8.1

2014-06-22 Thread Jigar Shah
, TaxonomyReader taxoReader, FacetsConfig config, FacetsCollector fc) throws IOException { super(indexFieldName, taxoReader, config); ... } Thanks Jigar Shah. On Sat, Jun 21, 2014 at 11:01 PM, Shai Erera ser...@gmail.com wrote: If you can, while in debug mode try to note the instance ID

Lucene Facets Module 4.8.1

2014-06-20 Thread Jigar Shah
Hello, I am using DrillSideways facets, and while getting children I am getting the exception below: 17:02:10,496 ERROR [stderr:71] (Thread-2 (HornetQ-client-global-threads-790878673)) java.lang.IllegalArgumentException: dimension CITY was not indexed into field $facets

Re: Lucene Facets Module 4.8.1

2014-06-20 Thread Jigar Shah
while indexing and searching. Why is it not identifying the correct field name, and instead goes for $facets? Please correct me if I have misunderstood, or suggest the correct way to solve the above problem. Many Thanks. Jigar Shah.

Proximity Search for SENTENCE and PARAGRAPH

2014-04-07 Thread Jigar Shah
using the SpanQuery API. Please let me know if some work has been done on such features, or if there is some proven approach. Thanks, Jigar Shah.