Re: Lucene , hits per document

2011-01-30 Thread sharma
Grant Ingersoll writes: With a little logic on your side to count, you can use SpanQueries to do that. -Grant On Jan 21, 2011, at 4:03 PM, Sharma Kollaparthi wrote: Hi, I have started to use Lucene for searching in HTML files. Is it possible to get Hits

Using coord of one BooleanQuery as a multiplier for its siblings

2012-08-28 Thread pranshu sharma
Hi there, I had a question about migrating the coord value one level up. My current query structure has a root BooleanQuery with a bunch of nested BooleanQuery children: one of these looks for all terms in the query issued, and I want to apply the coord factor for this BooleanQuery to all its sibl

Document Order in IndexWriter.addIndexes

2010-06-29 Thread Apoorv Sharma
When calling addIndexes, or addIndexes with no optimize, can any guarantee be given about the document order in the new index, given that the order of the directories/IndexReaders is fixed? That is, will the ith document coming from the jth IndexReader always have some x(i,j) position in the final merged

Re: Document Order in IndexWriter.addIndexes

2010-06-30 Thread Apoorv Sharma
This implies there is no way to merge two parallel indexes (based on parallel reader) to get a new parallel index. Correct me if I am wrong. On Tue, Jun 29, 2010 at 11:24 PM, Andrzej Bialecki wrote: > On 2010-06-30 05:12, Apoorv Sharma wrote: > > while calling addindexes or addindexe

Lucene , hits per document

2011-01-21 Thread Sharma Kollaparthi
using termfrequency vector. Thanks, Sharma -- Sharma Kollaparthi CDU Systems & Process Tools Software Developer I ANSYS INC.

Lucene index corruption on HDFS

2014-07-14 Thread varun sharma
I am building my code using Lucene 4.7.1 and Hadoop 2.4.0. Here is what I am trying to do. Create index: 1. Build the index in a RAMDirectory based on data stored on HDFS. 2. Once built, copy the index onto HDFS. Search index: 1. Bring the index stored on HDFS into RAMDirector

Re: Lucene index corruption on HDFS

2014-08-20 Thread varun sharma
Please do help here. Thank you, Varun. On Tuesday, 15 July 2014 2:14 PM, varun sharma wrote: I am building my code using Lucene 4.7.1 and Hadoop 2.4.0. Here is what I am trying to do. Create index: 1. Build the index in a RAMDirectory based on data stored on HDFS. 2. Once

Sampled Queries -- Use Cases and Feedback

2019-06-07 Thread Atri Sharma
Hi All, While working on a new Query type, I was inclined to think of a couple of use cases where the documents being scored need not be all of the data set, but a sample of them. This can be useful for very large datasets, where a query is only interested in getting the "feel" of the data, and ot

Re: Lucene FuzzyQuery

2019-06-07 Thread Atri Sharma
Is your FuzzyQuery matching any documents at all? It would be helpful if you could post your entire query. It might be happening that your Fuzzy query is not matching any hits, but when you specify it as a MUST clause, then it becomes a necessary condition for any hit to be returned by your overal

Re: Lucene FuzzyQuery

2019-06-07 Thread Atri Sharma
>However, with MUST > clause, that restriction is lifted. I meant that with a SHOULD clause, that restriction is lifted i.e. a query can score hits even if SHOULD clause does not match the hit (but other MUST clauses do match).

Re: FuzzyQuery

2019-06-09 Thread Atri Sharma
On Sun, Jun 9, 2019 at 8:53 PM Tomoko Uchida wrote: > > Hi, > > What analyzer do you use for the text field? Is the term "Main" > correctly indexed? Agreed. Also, it would be good if you could post your actual code. What analyzer are you using? If you are using StandardAnalyzer, then all of your

Re: Sampled Queries -- Use Cases and Feedback

2019-06-09 Thread Atri Sharma
Any thoughts on this? I am envisioning applications to machine learning systems, where the training dataset might be a small sample of the entire dataset, and the user wants scoring to be done only on samples of the dataset. On Fri, Jun 7, 2019 at 5:45 PM Atri Sharma wrote: > > Hi All, >

Re: Lucene FuzzyQuery

2019-06-10 Thread Atri Sharma
> i make sure i specify a string with 1 edit away misspelled and that > never gets hit but the word with correct spelling is in the index. How long are your query terms and the actual word? For fuzzy query to match, your edit distance needs to be less than the smaller of the query and the actual w
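The edit-distance condition discussed in this reply can be made concrete with a small plain-Java sketch of Levenshtein distance. This is only an illustration of the metric: it is not how Lucene's FuzzyQuery is implemented, and the class and method names (`EditDistanceSketch`, `levenshtein`) are hypothetical.

```java
final class EditDistanceSketch {
    /** Number of single-character insertions, deletions, or substitutions needed to turn a into b. */
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        // Turning the empty prefix of a into the first j characters of b takes j insertions.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // deleting the first i characters of a
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertOrDelete = Math.min(curr[j - 1] + 1, prev[j] + 1);
                curr[j] = Math.min(insertOrDelete, prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        // Compare a misspelled query term with the indexed (lowercased) term.
        System.out.println(levenshtein("mian", "main"));
        System.out.println(levenshtein("tesk", "test"));
    }
}
```

Computing the distance between the query term and the term as it was actually indexed (for example after lowercasing by the analyzer) is a quick way to check whether a fuzzy match is even possible.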

Re: FuzzyQuery

2019-06-10 Thread Atri Sharma
ing > in the call. > > Best regards > > > > On 6/10/19 10:47 AM, baris.ka...@oracle.com wrote: > > How do i check how it is indexed? lowecase or uppercase? > > > > only way is now to by testing. > > > > i am using standardanalyzer. > > > >

Re: Incremental Lucene Index

2019-06-24 Thread Atri Sharma
Yes, Lucene supports incremental indexing. Note that the underlying structure is append only, so you are still paying the cost of delete + insert, but the semantics are what you expect them to be. On Mon, 24 Jun 2019 at 7:18 PM, Sukhendu Kumar Biswal wrote: > Hi Team, > Does Lucene support incre

Re: how to find out each score contribution from booleanquery components

2019-06-26 Thread Atri Sharma
It depends a lot on the actual clauses (whether they are SHOULD, MUST, MUST_NOT), each query’s type (phrase, term etc). Could you post your query and the explain plan of IndexSearcher post the rewrite? On Wed, 26 Jun 2019 at 6:46 PM, wrote: > Hi,- > > how can one find out each score contribut

Re: how to find out each score contribution from booleanquery components

2019-06-26 Thread Atri Sharma
n required clause (+countryDFLT:united > (countryDFLT:uniten)^0.4202 +countryDFLT:states > (countryDFLT:statesir)^0.56) > 0.0 = Failure to meet condition(s) of required/prohibited clause(s) >0.0 = no match on required clause (countryDFLT:united) > 0.0 = no matching

Re: Multi field Lucene index

2019-07-05 Thread Atri Sharma
Should not matter, AFAIK. If your first MUST clause in a BooleanQuery fails to match for a document, then there is no point for the engine to match further clauses, right? On Fri, Jul 5, 2019 at 7:56 PM wrote: > > Re-sending and please let me know Your amazing thoughts > > Happy July 4th > > Bes

Re: Impact and WAND

2019-07-11 Thread Atri Sharma
ments in postings lists. > Then this information is leveraged by block-max WAND in order to skip > low-scoring blocks. > > This does indeed help avoid reading norms, but also document IDs and > term frequencies. > > On Wed, Jul 10, 2019 at 4:10 PM Wu,Yunfeng > mailto:wuyunfen.

Re: Lucene 5.2.1 score for MUST_NOT query

2019-08-04 Thread Atri Sharma
MUST_NOT represents a clause which must not match against a document in order for it to be qualified as a hit (think of SQL’s NOT IN). MUST_NOT clauses are used as filters to eliminate candidate documents. On Sun, 4 Aug 2019 at 23:11, Claude Lepere wrote: > Hello! > > What score of a hit in res
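The filtering role of MUST_NOT described here, together with MUST and SHOULD, can be modelled in a few lines of plain Java. This is a simplified sketch of the matching rules only (no scoring), written under the assumption that a query with no MUST clause needs at least one SHOULD clause to match; the types `BooleanClauseSketch`, `Clause`, and `Occur` are hypothetical stand-ins, not Lucene classes.

```java
import java.util.List;
import java.util.Set;

final class BooleanClauseSketch {
    enum Occur { MUST, SHOULD, MUST_NOT }

    /** A single-term clause; a hypothetical stand-in for a real query clause. */
    static final class Clause {
        final String term;
        final Occur occur;

        Clause(String term, Occur occur) {
            this.term = term;
            this.occur = occur;
        }
    }

    /** Returns true if a document containing docTerms satisfies all clauses. */
    static boolean matches(Set<String> docTerms, List<Clause> clauses) {
        boolean sawMust = false;
        boolean shouldMatched = false;
        for (Clause c : clauses) {
            boolean hit = docTerms.contains(c.term);
            switch (c.occur) {
                case MUST:
                    if (!hit) return false; // a required clause failed
                    sawMust = true;
                    break;
                case MUST_NOT:
                    if (hit) return false;  // a prohibited clause matched: document is filtered out
                    break;
                case SHOULD:
                    shouldMatched |= hit;   // optional; only decisive when there is no MUST clause
                    break;
            }
        }
        // MUST_NOT clauses alone never select documents; they only remove candidates.
        return sawMust || shouldMatched;
    }
}
```

In this model a MUST_NOT clause contributes nothing positive: it can only turn a candidate into a non-hit, which is why a query consisting solely of MUST_NOT clauses returns no documents here.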

Re: partial match

2019-08-04 Thread Atri Sharma
It is not very clear as to what is it that you are trying to achieve here. If you want to match similar terms as the one you specify in the query (test, tesk, lest etc), then a fuzzy query (~) should suffice. Note that you cannot specify a mandatory part of the text that has to match in every resul

Re: partial match

2019-08-05 Thread Atri Sharma
Yes, that will allow specifying wildcard as the first character, but it can lead to very slow queries, especially on larger indices. On Mon, Aug 5, 2019 at 6:08 PM wrote: > > Does QueryParser.setAllowLeadingWildCard(true) work? > > this will allow to use wildcard as first char in the search strin

Re: Parameterized queries in Lucene

2019-10-21 Thread Atri Sharma
I am curious — what use case are you targeting to solve here? In relational world, this is useful primarily due to the fact that prepared statements eliminate the need for re planning the query, thus saving the cost of iterating over a potentially large combinatorial space. However, for Lucene, th

Re: Parameterized queries in Lucene

2019-10-23 Thread Atri Sharma
query many times with a different parameter means recreating the > > Query > > > every time. > > > > > > I admit that creation of the Lucene query is not the most expensive > part > > of > > > the planning process still we can gain something by not creati

Re: Lucene index directory grows and shrinks

2019-11-04 Thread Atri Sharma
These are typical symptoms of an index merge. However, it is hard to say more without more data. What is your segment size limit? Have you changed the default merge frequency or max segments configuration? Would you have an estimate of ratio of number of segments reaching max limit / to

Re: PhraseQuery

2020-01-24 Thread Atri Sharma
PhraseQuery enforces the order of the terms specified and needs an exact match on term order unless slop is specified. When appending terms, term position numbers need to be incremental in the builder. On Fri, Jan 24, 2020 at 11:15 PM wrote: > > Hi,- > > how do i enforce the order of sequence of ter
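The role of term order and positions in this reply can be illustrated with a toy exact-phrase matcher in plain Java. It is a sketch under stated assumptions: slop is ignored, positions are non-negative offsets relative to the first phrase term, and `PhraseMatchSketch` and `matchesExact` are hypothetical names rather than anything from the Lucene API.

```java
final class PhraseMatchSketch {
    /**
     * Returns true if some start offset exists such that every phrase term
     * appears at (start + its relative position) in the document tokens.
     */
    static boolean matchesExact(String[] docTokens, String[] terms, int[] positions) {
        for (int start = 0; start < docTokens.length; start++) {
            boolean ok = true;
            for (int i = 0; i < terms.length; i++) {
                int idx = start + positions[i];
                if (idx >= docTokens.length || !docTokens[idx].equals(terms[i])) {
                    ok = false;
                    break;
                }
            }
            if (ok) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] doc = {"the", "quick", "brown", "fox"};
        // Terms in document order, with incremental positions 0 and 1.
        System.out.println(matchesExact(doc, new String[] {"quick", "brown"}, new int[] {0, 1}));
        // Same terms in reversed order do not match without slop.
        System.out.println(matchesExact(doc, new String[] {"brown", "quick"}, new int[] {0, 1}));
    }
}
```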

Re: Resizable LRUQueryCache

2020-03-05 Thread Atri Sharma
On Fri, Mar 6, 2020 at 1:04 AM Aadithya C wrote: > > In my personal opinion, there are a few advantages of resizing - > > > 1) The size of the cache is unpredictable as there is a fixed(guesstimate) > accounting for the key size. With a resizable cache, we can potentially > cache heavier queries a

Re: [VOTE] Lucene logo contest, third time's a charm

2020-09-01 Thread Atri Sharma
D (binding) On Wed, 2 Sep 2020 at 01:51, Ryan Ernst wrote: > Dear Lucene and Solr developers! > > > > Sorry for the multiple threads. This should be the last one. > > > > In February a contest was started to design a new logo for Lucene > > [jira-issue]. The initial attempt [first-vote] to call

[ANNOUNCE] Apache Lucene 8.7.0 released

2020-11-04 Thread Atri Sharma
03/11/2020, Apache Lucene™ 8.7 available The Lucene PMC is pleased to announce the release of Apache Lucene 8.7. Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text

Re: Potential bug

2021-06-14 Thread Atri Sharma
+1 to Adrien. Let's keep the tone neutral. On Mon, 14 Jun 2021, 16:00 Adrien Grand, wrote: > Baris, you called out an insult from Alessandro and your replies suggest > anger, but I couldn't see an insult from Alessandro actually. > > +1 to Alessandro's call to make the tone softer on this discu

Lucene cpu utilization & scoring

2021-08-20 Thread Varun Sharma
Hi, We have a large index that we divide into X Lucene indices - we use Lucene 6.5.0. Each of our serving machines serves 8 Lucene indices in parallel. We are getting realtime updates to each of these 8 indices. We are seeing a couple of things: a) When we turn off realtime updates, performanc

Re: Lucene cpu utilization & scoring

2021-08-20 Thread Varun Sharma
osts you are seeing are related to computing scores and not required > for matching? > > -Mike > > On Fri, Aug 20, 2021 at 2:02 PM Varun Sharma > wrote: > > > > Hi, > > > > We have a large index that we divide into X lucene indices - we use > lucene &g

Need help on aggregation of nested documents

2021-11-14 Thread Gopal Sharma
and have done transactions worth more than 500$ between two date ranges. The queries can go deeper than this. Thanks in advance. Gopal Sharma

Re: Need help on aggregation of nested documents

2021-11-15 Thread Gopal Sharma
n and then finally doing the aggregates. Is there any other way around this? Thanks Gopal Sharma On Mon, Nov 15, 2021 at 10:36 PM Adrien Grand wrote: > It's not straightforward as we don't provide high-level tooling to do this. > You need to use the BitSetPro

Re: Need help on aggregation of nested documents

2021-11-16 Thread Gopal Sharma
reader.document(int docID) and then parse it which would be again the same issue i pointed out. Thanks Gopal Sharma On Tue, Nov 16, 2021 at 1:41 PM Adrien Grand wrote: > Indeed you shouldn't load all hits, you should register a > org.apache.lucene.search.Collector that will aggregate

Pattern Capture Group Token Filter

2022-04-17 Thread dishant sharma
Hi, all! I am currently using the default lucene's pattern capture token filter in one of my projects where i have to utilize it for pattern matching. The issue with it is that the default pattern capture token filter gives the same start and end offset for each generated token: the start and end o

pattern capture token filter

2022-04-23 Thread dishant sharma
I am currently making some changes in the default pattern capture group token filter code to meet my requirement. I am a beginner in Java, so I am finding it a bit hard to fully understand the code and make changes. I have successfully made my changes in the incrementToken() method and got the desired r

Re: [HELP] Link your Apache Lucene Jira and GitHub account ids before Thursday August 4 midnight (in your local time)

2022-08-01 Thread Atri Sharma
Mine is atris for github, atri for JIRA On Mon, Aug 1, 2022 at 4:03 PM Tomoko Uchida wrote: > > Hi Mike, Marcus, and Praveen: > > I verified the added two mappings - these Jira users have activity on > Lucene Jira, also corresponding GitHub accounts are valid. > - marcussorealheis > - pru30 > > T

Re: [ANNOUNCE] Issue migration Jira to GitHub starts on Monday, August 22

2022-08-25 Thread Vigya Sharma
Love this! Thanks for all the hard work, Tomoko. - Vigya On Wed, Aug 24, 2022 at 12:27 PM Michael Sokolov wrote: > Thanks! It seems to be working nicely. > > Question about the fix-version: tagging. I wonder if going forward we > want to main that for new issues? I happened to notice there is a

Re: Is there a way to customize segment names?

2022-12-30 Thread Vigya Sharma
Hi Patrick, This is an interesting question, and from what I understood, I see correctness problems in what you're trying to implement. Let me make sure I understand correctly... So indexer-1 created segments 1,2,3,4 and indexer-2 created segments 1', 2', 3', 4' independently (they just have the

Proposal to Reimplement Disk Usage API - Request for Feedback and Collaboration

2023-05-24 Thread Deepika Sharma
Dear Community I am writing to share thoughts on the existing Disk Usage API, I believe there is an opportunity to improve its functionality and performance through a reimplementation. Currently, the best tool we have for this is based on a custom Codec that separates storage by field; to get the

Lucene Index Writer in a distributed system

2023-10-19 Thread Gopal Sharma
Hello Team, I am new to Lucene and want to use Lucene in a distributed system to write to an index on Amazon EFS. As per my understanding, the index writer for a particular index needs to be opened by one server only. Is there a way we can achieve this in a distributed system to write in parallel in Luce

Excessive reads while doing commit in lucene

2024-09-04 Thread Gopal Sharma
lyzer); writter = new IndexWriter(indexDirectory, indexWriterConfig); Can someone please help to understand why such huge reads are happening? and how to mitigate such issues. Thanks in advance Gopal Sharma

Somewhat complex scoring/boosting

2008-09-05 Thread Ravindra Sharma
Hi Folks, I have somewhat complex scoring/boosting requirement. Say I have 3 text fields A, B, C and a Numeric field called D. Say My query is "testrank". Scoring should be based on following: Query matches 1. text fields A, B and C, & Highest value of D (highest boost/rank) 2. A and B, & Highe

Custom scoring example ...

2008-09-05 Thread Ravindra Sharma
I am looking for an example, if anyone has done any custom scoring with Lucene. I need to implement a Query similar to DisjunctionMaxQuery; the only difference would be that it should score based on the sum of the subqueries' scores instead of the max. Any custom scoring example will help. (On one hand,
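The difference between max-based and sum-based scoring asked about here can be shown with a small arithmetic sketch in plain Java. It models the documented DisjunctionMaxQuery scoring formula (the best subquery score plus a tie-breaker multiplier times the remaining scores), assuming non-negative scores; `DisMaxSketch` and `combine` are hypothetical names, and this is not a custom Lucene Query implementation.

```java
final class DisMaxSketch {
    /**
     * Combines subquery scores as max + tieBreaker * (sum of the other scores).
     * tieBreaker = 0.0 gives the pure maximum; tieBreaker = 1.0 gives the plain sum.
     */
    static double combine(double[] subScores, double tieBreaker) {
        double max = 0.0;
        double sum = 0.0;
        for (double s : subScores) {
            max = Math.max(max, s);
            sum += s;
        }
        return max + tieBreaker * (sum - max);
    }

    public static void main(String[] args) {
        double[] scores = {1.0, 2.0, 4.0};
        System.out.println(combine(scores, 0.0)); // pure max
        System.out.println(combine(scores, 1.0)); // plain sum
    }
}
```

Under this formula a tie-breaker of 1.0 already yields sum-of-scores behaviour, so it may be worth checking whether that setting covers the requirement before writing a custom query class.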

Counting entries in an index

2010-02-20 Thread Apoorv Sharma
Hello, I am trying to count the total number of posting entries for terms having a given prefix in an index, and also the number of such terms in the index. The following is the code I am using for that. The problem is that the result is not as expected. Can you point out if I am doing som

Re: Scanning docs at index time

2010-02-22 Thread Apoorv Sharma
I don't know of classes which will be suitable but if they are ordered queries a simple code could easily be written. On Mon, Feb 22, 2010 at 9:59 PM, Nigel wrote: > I'd like to scan documents as they're being indexed, to find out > immediately > if any of them match certain queries. The goal i

Re: Prioiritze new documents

2008-01-01 Thread Shailendra Sharma
ints: > http://wiki.apache.org/lucene-java/BasicsOfPerformance > http://wiki.apache.org/lucene-java/LuceneFAQ > > > > > > - > To unsubscribe, e-mail: [EMAIL PROTECTED] > For additional commands, e-mail: [EMAIL PROTECTED] > > -- Shailendra Sharma +91-988-011-3066

Re: CachingWrapperFilter: why cache per IndexReader?

2008-01-01 Thread Shailendra Sharma
t Ingersoll > http://lucene.grantingersoll.com > http://www.lucenebootcamp.com > > -- Shailendra Sharma +91-988-011-3066

RE: how to all documents from

2008-02-29 Thread Shailendra Sharma
Create a match-all-docs query like the following: MatchAllDocsQuery matchAllDocsQuery = new MatchAllDocsQuery(); And then search as you would with any other query - searcher.search(matchAllDocsQuery) - it returns the hits. Thanks, Shailendra -Original Message- From: sandyg [mailto:[

indexing unsupported mime types using Lucene

2008-06-18 Thread Gaurav Sharma
Hi, I am using Lucene for indexing and searching documents. It is working fine for supported documents. Now I want to index documents with unsupported MIME types. Right now I am using LIUS, which is built over Lucene, for indexing the documents. Is there any tool which I can use for indexing the

Re: indexing unsupported mime types using Lucene

2008-06-19 Thread Gaurav Sharma
Otis > -- > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch > > > - Original Message >> From: Gaurav Sharma <[EMAIL PROTECTED]> >> To: java-user@lucene.apache.org >> Sent: Wednesday, June 18, 2008 10:07:22 AM >> Subject: indexing unsupp

too many clauses exception

2008-07-03 Thread Gaurav Sharma
Hi, I am stuck with one more exception. When I use a wildcard such as a*, I get a "too many clauses" exception. It says the maximum clause count is set to 1024. Is there any way to increase this count? Can you please help me overcome this? Thanks in advance. -Gaurav - -Gaura

too many clauses exception

2008-07-03 Thread Gaurav Sharma
Hi, I am stuck with an exception in Lucene (too many clauses). When I use a wildcard such as a*, I get a "too many clauses" exception. It says the maximum clause count is set to 1024. Is there any way to increase this count? Can you please help me overcome this? Thanks in advance. -

Clustering in MultiSearcher Searchables

2007-04-24 Thread Sawan Sharma
Hi all, I am using MultiSearcher to search more than one index folder. I have one IndexSearcher array which contains 3 IndexSearchers... 01. C:\IndexFolder1 02. C:\IndexFolder2 03. C:\IndexFolder3 When I searched in the 3 index folders using a MultiSearcher, I got 3000 hits. 1 to 1000 from C

Re: Clustering in MultiSearcher Searchables

2007-04-24 Thread Sawan Sharma
earcher(int n) (n would be the docid of result x). Hope this helps, Doron "Sawan Sharma" <[EMAIL PROTECTED]> wrote on 24/04/2007 03:19:47: > Hi all, > > I am using MultiSearcher to search more then one Index folders. I have one > Index searcher array which conta

Group the search results by a given field

2007-05-17 Thread Sawan Sharma
Hi All, I was wondering - is it possible to search and group the results by a given field? For example, I have an index with several million records. Most of them are different Features of the same ID. I'd love to be able to do.. groupby=ID or something like that in the results, and provide the

Re: efficient way to filter out unwanted results

2007-06-14 Thread Sawan Sharma
Hello Jay, I am not sure up to what level I understood your problem . But as far as my assumption, you can try HitCollector class and its collect method. Here you can get DocID for each hit and can remove while searching. Hope it will be useful. Sawan (Chambal.com inc. NJ USA) On 6/15/07,

Facet searching on single field with multiple words value

2007-06-20 Thread Sawan Sharma
Hi friends, I tried to implement facet searching in some sample code, and when I tried it with various cases I found no results in one case. I wanted to narrow by one field, "title", and gave a multiple-word value, or say a phrase. So first, in this case, preparing the Lucene query and converting it into QueryF

Re: How to show category count with results?

2007-07-31 Thread Shailendra Sharma
your piece of code would be really small. Thanks, Shailendra Sharma CTO, Ver Se' Innovation Private Ltd. Bangalore, India On 7/30/07, Dennis Kubes <[EMAIL PROTECTED]> wrote: > > We found that a fast way to do this simply by running a query for each > category and getting the ma

Re: Lucene Field score value

2007-07-31 Thread Shailendra Sharma
Though I am not sure what is the possible use case for thing like below, but here is the pointer: Using IndexSearcher you can get the "Explanation" for the given query and document-id. Complex Explanation has multiple sub-explanations and so forth. Simple Explanation would contain the weight of th

Re: Can I do boosting based on term postions?

2007-07-31 Thread Shailendra Sharma
without re-creating indices everytime. Thanks, Shailendra Sharma, CTO, Ver se' Innovation Pvt. Ltd. Bangalore, India On 8/1/07, Cedric Ho <[EMAIL PROTECTED]> wrote: > > Hi all, > > I was wondering if it is possible to do boosting by search terms' > position in the document.

Re: Can I do boosting based on term postions?

2007-08-02 Thread Shailendra Sharma
; > Cedric, > > > > SpanFirstQuery could be a solution without payloads. > > You may want to give it your own Similarity.sloppyFreq() . > > > > Regards, > > Paul Elschot > > > > On Thursday 02 August 2007 04:07, Cedric Ho wrote: > > > Thanks for

Re: Can I do boosting based on term postions?

2007-08-03 Thread Shailendra Sharma
IL PROTECTED]> wrote: > > > Cedric, > > > > > > SpanFirstQuery could be a solution without payloads. > > > You may want to give it your own Similarity.sloppyFreq() . > > > > > > Regards, > > > Paul Elschot > > > > >

Re: Can I do boosting based on term postions?

2007-08-03 Thread Shailendra Sharma
Ah, Good way ! On 8/4/07, Paul Elschot <[EMAIL PROTECTED]> wrote: > > On Friday 03 August 2007 20:35, Shailendra Sharma wrote: > > Paul, > > > > If I understand Cedric right, he wants to have different boosting > depending > > on search term positions in t

Search problems

2005-10-27 Thread Sharma, Siddharth
Hi My index has 4 keyword fields and one unindexed field. I want to search by the 4 keyword fields and return the one unindexed field. I can iterate over the documents via Luke. But when I search for the same values that I see via Luke, it does not find the document. Out of the 4 fields, 2 are a

RE: Search problems

2005-10-28 Thread Sharma, Siddharth
I figured out the problem when I copied the document from the clipboard. It had trailing spaces. After I changed the database query to apply ltrim(rtrim( to each field prior to indexing, it's fine now. -Original Message- From: Sharma, Siddharth Sent: Thursday, October 27, 2005 4:35
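The trailing-space problem reported in this message can be reproduced with an exact-match lookup in plain Java. This is a minimal sketch, assuming keyword fields behave like untokenized exact-match keys; the class `KeywordNormalizer`, the method `normalize`, and the sample value `SKU123` are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

final class KeywordNormalizer {
    /** Strips leading and trailing whitespace so stored keys match what users type. */
    static String normalize(String raw) {
        return raw == null ? "" : raw.trim();
    }

    public static void main(String[] args) {
        // A map stands in for an exact-match (untokenized) keyword field.
        Map<String, String> index = new HashMap<>();
        index.put("SKU123  ", "doc-1");            // value as it came from the database
        System.out.println(index.get("SKU123"));   // not found: the stored key has trailing spaces

        index.clear();
        index.put(normalize("SKU123  "), "doc-1"); // trimmed before "indexing"
        System.out.println(index.get("SKU123"));   // found
    }
}
```

Trimming can be done either in the database query, as the message describes, or in the indexing code, as long as the same normalization is applied to the search input.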

StandardAnalyzer and thread safety

2005-11-01 Thread Sharma, Siddharth
Is using a QueryParser to parse a query using the same, single instance of Analyzer thread-safe? Or should I create a new Analyzer each time?

JDK version with lucene 1.4.3

2005-11-02 Thread Sharma, Siddharth
I have downloaded Lucene 1.4.3 I am trying to narrow down on the JRE version to use. We have the flexibility to use 1.3.1 up. Which JVM will be the best for running Lucene? I saw a note on the FAQ that said that Lucene will run on 1.3.1 but will require 1.4 to compile. Why would anyone want to com

RE: lucene jar and war

2005-11-02 Thread Sharma, Siddharth
Place the lucene jar file in the WEB-INF/lib directory of your web application prior to creating its war. If your ISP inspects the war and removes all jar files within it, then I suppose you might just have to place all the lucene classes under WEB-INF/classes of your web application as 'loose cla

Greetings and my first question - Is it a good practise to store application configuration in Lucene

2006-01-31 Thread Pradeep Sharma
I have just joined this user group, but I probably will be asking questions / contributing for a while now as I am starting to work on a product which will use Lucene exclusively. Still in the designing phase, and I see that we need to manage several user / application specific configurations

two applications accessing same index

2006-02-05 Thread Pradeep Sharma
I have two applications, one which will be generating all the indexes and the second one which will be reading those indexes. I cannot keep them in the same application, because one will run all the times in batches via some sort of scheduler to generate the indexes and the application which wil

Re: IndexWriter.addIndexes & optimizatio

2006-06-11 Thread vipin sharma
- > Just set your maxBufferedDocs to as high a number as your RAM/heap will let you, and pick a mergeFactor that is high, but doesn't get you in trouble with open files. Can you please explain this in brief? Regards and thanks, On 6/9/06, Otis Gospodnetic <[EMAIL PROTECTED]> wrote: When wri

Getting count on distinct values of a field.

2006-06-12 Thread vipin sharma
Hi, I am having a problem getting the count of distinct values of a field. The reason for getting this value is that each document in the index belongs to one predefined class, and I want to get the number of documents belonging to each class. Regards..
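The per-class count requested here amounts to grouping documents by a field value and counting each group. A minimal plain-Java sketch follows, assuming the class value of every document has already been read into a list; `FieldValueCounter` and `countByValue` are hypothetical names and no Lucene API is involved.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class FieldValueCounter {
    /** Counts how many documents carry each distinct value of the field. */
    static Map<String, Integer> countByValue(List<String> valueOfEachDoc) {
        Map<String, Integer> counts = new TreeMap<>(); // sorted by value for stable output
        for (String value : valueOfEachDoc) {
            counts.merge(value, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> classes = java.util.Arrays.asList("sports", "news", "sports", "tech");
        System.out.println(countByValue(classes));
    }
}
```

The same counting loop could run inside a collector that sees each matching document once, which avoids loading all hits before aggregating.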

Is Lucene right for me?

2005-10-08 Thread Sharma, Siddharth
Hi I am complete newbie to Lucene. In fact I'm not even a search guy. I looked up terms such as stemming just yesterday. So this is going to be so much fun ;) Here's the problem I am trying to solve: I work in the B2B space at Staples (an office supplies company in the US). We sell office products

RE: Is Lucene right for me?

2005-10-10 Thread Sharma, Siddharth
Hoss Thanks for the reply. The posting was an excellent write-up and helped me visualize my problem domain and solution better. I like the idea about storing filter information in the contract index indexed by company. It might work in my case. I am not sure if I understand the BitSet solution t

One index or 2 indices

2005-10-11 Thread Sharma, Siddharth
Hiya Given that I have two high level business entities, catalog (containing product information) and contract (containing filter criteria about which products are available for sale and which are not), what is a better approach? 1. To have two different indices and query them separately. OR 2. H

Too many clauses

2005-10-17 Thread Sharma, Siddharth
Query: caught a class org.apache.lucene.queryParser.ParseException with message: Too many boolean clauses I realize why this is happening (the 1024 clauses limit for BooleanQuery). My question is more design related. During customer registration, the customer defines a set of skus/products that

RE: Too many clauses

2005-10-17 Thread Sharma, Siddharth
se the max clause count. //Setting the clause Count BooleanQuery.setMaxClauseCount(int); Can use maxint or some number smaller.. When I set this high, I have had to set the java pool higher for memory as well. Tom -Original Message----- From: Sharma, Siddharth [mailto:[EMAIL PROTECTED] Se

RE: Too many clauses

2005-10-19 Thread Sharma, Siddharth
Thanks Chris I haven't tried it yet, but I think I understand your idea now (after 24 hours, man I'm slow on the uptake;) I'll try it today. -Sid -Original Message- From: Chris Hostetter [mailto:[EMAIL PROTECTED] Sent: Monday, October 17, 2005 5:05 PM To: java-user@lucene.apache.org Sub

RangeFilter source

2005-10-20 Thread Sharma, Siddharth
I downloaded the source code of 1.4.3 but did not find the source of RangeFilter. I could not find it in the sandbox either? RangeFilter, where art thou?

Thread safety question

2005-10-25 Thread Sharma, Siddharth
Hi I have an instance (each) of IndexSearcher and StandardAnalyzer housed in a Singleton and I intend to use this one single instance (of Searcher and Analyzer) for multiple concurrent search requests. I vaguely remember reading that I (as a client) do not have to synchronize. Lucene internals take

[ANNOUNCE] Apache Lucene 10.3.0 released

2025-09-13 Thread Vigya Sharma
The Lucene PMC is pleased to announce the release of Apache Lucene 10.3.0. Apache Lucene is a high-performance, full-featured search engine library written entirely in Java. It is a technology suitable for nearly any application that requires structured search, full-text search, faceting, nearest-

Carrot 2 with lucene prototype!!!

2006-06-29 Thread arun sharma (rinku)
Hello gentlemen, I am a novice with Lucene and Carrot2, but I have an urgent requirement to build a prototype using Lucene and Carrot2. Please help me with a working web application demo along with code. Thanks Arun