IndexReader.getTermFreqVectors(2)[0].getTermFrequencies()[5];
In the above example, Lucene gives me the term frequency of the term at
index 5 (e.g. say "planet") in the tfv of the corpus document "2".
But I need to get the term frequency for a specified term using its string
value.
E.g.:
term
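A minimal sketch for the question above, assuming the terms and frequencies have already been pulled out of the vector as two parallel arrays (as getTerms() and getTermFrequencies() return them). The class name TermLookup and the sample data are hypothetical. If I remember the 2.9 API correctly, TermFreqVector also offers indexOf(String), which would replace the loop.

```java
final class TermLookup {
    /**
     * Returns the frequency stored for the given term, or 0 if the term
     * does not occur in the vector. terms[i] and freqs[i] belong together.
     */
    static int frequencyOf(String[] terms, int[] freqs, String term) {
        for (int i = 0; i < terms.length; i++) {
            if (terms[i].equals(term)) {
                return freqs[i];
            }
        }
        return 0; // term not present in this document's vector
    }
}
```

Note that the lookup string has to be the analyzed form of the term (for example the stemmed form), since that is what the index stores.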
Hi,
Is there any way to get the total count of terms in the Term Frequency
Vector (tvf)? I need to calculate the Normalized term frequency of each
term in my tvf. I know how to obtain the length of the tvf, but it doesn't
work since I need to count duplicate occurrences as well.
Highly
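One possible way to get the total count, sketched under the assumption that the frequencies are available as an int[] (as getTermFrequencies() returns them): summing that array counts duplicate occurrences, and dividing each entry by the sum gives a normalized term frequency. The class name NormalizedTf is hypothetical.

```java
final class NormalizedTf {
    /** Total number of term occurrences, duplicates included. */
    static int totalOccurrences(int[] freqs) {
        int total = 0;
        for (int f : freqs) {
            total += f;
        }
        return total;
    }

    /** Each frequency divided by the total; all zeros for an empty vector. */
    static double[] normalize(int[] freqs) {
        int total = totalOccurrences(freqs);
        double[] result = new double[freqs.length];
        if (total == 0) {
            return result;
        }
        for (int i = 0; i < freqs.length; i++) {
            result[i] = (double) freqs[i] / total;
        }
        return result;
    }
}
```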
Hi,
I have a document collection with hundreds of documents. I need to know
the term frequency for a given query term in each document. I know that
'hit.score' will give me the Lucene score for each document (and it
includes term frequency as well). But I need to call only term frequencies
in
Thanks Adrien.
On Mon, Mar 27, 2017 at 6:56 PM, Adrien Grand <jpou...@gmail.com> wrote:
> You can use IndexSearcher.explain to see how the score was computed.
>
> On Mon, Mar 27, 2017 at 14:46, Manjula Wijewickrema <manjul...@gmail.com>
> wrote:
>
> >
Hi,
Can someone help me to understand the value given by 'hit.score' in Lucene?
I indexed a single document with five different words with different
frequencies and tried to understand this value. However, it doesn't seem to
be normalized term frequency or tf-idf. I am using Lucene 2.9.1.
Any help
Hi,
I tried to index bigrams from a document and the system gave me
the following output with the frequencies of the bigrams (output 1):
array size:15
array terms are:{contents: /1, assist librarian/1, assist manjula/2, assist
sabaragamuwa/1, fine manjula/1, librari manjula/1,
Hi,
Could you please explain to me how to determine the tf-idf score for bigrams? My
program is able to index and search bigrams correctly, but it does not
calculate the tf-idf for bigrams. If someone can, please help me to resolve
this.
Regards,
Manjula.
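A rough sketch of a tf-idf calculation that treats each bigram as an ordinary term. It uses the tf and idf factors of Lucene's classic DefaultSimilarity as I understand them (tf = sqrt(freq), idf = 1 + ln(numDocs / (docFreq + 1))); the class name and the input numbers are hypothetical, and the full Lucene score has further factors that are left out here.

```java
final class ClassicTfIdf {
    /** Term-frequency factor: square root of the raw count in the document. */
    static double tf(int freq) {
        return Math.sqrt(freq);
    }

    /** Inverse document frequency: rarer terms (or bigrams) get a higher value. */
    static double idf(int docFreq, int numDocs) {
        return Math.log((double) numDocs / (docFreq + 1)) + 1.0;
    }

    /** Plain tf * idf for one term (or bigram) in one document. */
    static double tfIdf(int freq, int docFreq, int numDocs) {
        return tf(freq) * idf(docFreq, numDocs);
    }
}
```

For a bigram, freq would be the number of times the bigram occurs in the document and docFreq the number of corpus documents containing it.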
having the bigram. I hope this is fine.
Alternatively, use NGramTokenizer (with n=2 in your case) while
indexing. In such a case, each bigram can be interpreted as a normal Lucene
term.
Thanks,
Parnab
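To illustrate what "each bigram as a normal term" means, here is a small plain-Java sketch that builds word bigrams from an already tokenized document and counts them. It does not use Lucene; the class name WordBigrams is hypothetical. As far as I know, NGramTokenizer produces character n-grams, while word bigrams like the ones in this thread are what ShingleFilter produces, so treat this as a model of the latter.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class WordBigrams {
    /** Joins every pair of adjacent tokens with a single space. */
    static List<String> bigrams(List<String> tokens) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i + 1 < tokens.size(); i++) {
            out.add(tokens.get(i) + " " + tokens.get(i + 1));
        }
        return out;
    }

    /** Counts how often each bigram occurs, in order of first appearance. */
    static Map<String, Integer> counts(List<String> tokens) {
        Map<String, Integer> result = new LinkedHashMap<>();
        for (String bigram : bigrams(tokens)) {
            result.merge(bigram, 1, Integer::sum);
        }
        return result;
    }
}
```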
On Wed, Jul 2, 2014 at 8:45 AM, Manjula Wijewickrema manjul...@gmail.com
wrote:
Hi
Hi,
In my programme, I tried to select the most relevant document based on
bigrams.
System gives me the following output.
{contents: /1, assist librarian/1, assist manjula/2, assist sabaragamuwa/1,
fine manjula/1, librari manjula/1, librarian sabaragamuwa/1, main
librari/2, manjula assist/4,
Dear Steve,
It works. Thanks.
On Wed, Jun 11, 2014 at 6:18 PM, Steve Rowe sar...@gmail.com wrote:
You should give sw rather than analyzer in the IndexWriter constructor.
Steve
www.lucidworks.com
On Jun 11, 2014 2:24 AM, Manjula Wijewickrema manjul...@gmail.com
wrote:
Hi,
In my
Hi,
In my programme, I can index and search a document based on unigrams. I
modified the code as follows to obtain the results based on bigrams.
However, it did not give me the desired output.
public static void createIndex() throws CorruptIndexException,
Hi,
What are the other disadvantages (other than the time factor) of creating
an index for every request?
Manjula.
On Thu, Jun 5, 2014 at 2:34 PM, Aditya findbestopensou...@gmail.com wrote:
Hi Rajendra
You should NOT create an index writer for every request.
Whether it is time consuming to
...@gmail.com wrote:
Hi Manjula,
Sounds like ShingleFilter will do what you want:
http://lucene.apache.org/core/4_6_0/analyzers-common/org/apache/lucene/analysis/shingle/ShingleFilter.html
Steve
www.lucidworks.com
On Dec 22, 2013 11:25 PM, Manjula Wijewickrema manjul...@gmail.com
wrote
Dear All,
My Lucene programme is able to index single words and search the most
matching documents (based on term frequencies) from a corpus to
the input document.
Now I want to index two-word phrases and search the matching corpus
documents (based on phrase frequencies) to the input
Dear list,
My Lucene programme is able to index single words and search the most
matching documents (based on term frequencies) from a corpus to
the input document.
Now I want to index two-word phrases and search the matching corpus
documents (based on phrase frequencies) to the input
and then add your own words to it. You could
then initialize the analyzer using this new stop set instead of the default
stop set.
Hope that helps.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Tue, Dec 21, 2010 at 9:20 AM, manjula wijewickrema
manjul...@gmail.com wrote:
Hi,
1) In my
Hi,
1) In my application, I need to add more words to the stop word list.
Therefore, is it possible to add more words into the default Lucene stop
word list?
2) If it is possible, then how can I do this?
Appreciate any comment from you.
Thanks,
Manjula.
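The approach described in the reply above (copy the default stop set, add your own words, hand the new set to the analyzer) could look roughly like this in plain Java. DEFAULT_STOP_WORDS here is a hypothetical stand-in; in Lucene 2.9/3.x the real default set should be StopAnalyzer.ENGLISH_STOP_WORDS_SET, and the merged set would then be passed to an analyzer constructor that accepts a stop-word Set, but please check both against the javadocs for your version.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

final class StopWords {
    // Hypothetical stand-in for the analyzer's built-in (read-only) stop set.
    static final Set<String> DEFAULT_STOP_WORDS = Collections.unmodifiableSet(
            new HashSet<>(Arrays.asList("a", "an", "and", "the", "of")));

    /** Copies the defaults into a new set and adds the extra words to the copy. */
    static Set<String> extend(Set<String> defaults, String... extra) {
        Set<String> merged = new HashSet<>(defaults);
        merged.addAll(Arrays.asList(extra));
        return merged;
    }
}
```

The copy matters because the default set is typically unmodifiable, so adding words to it directly would throw an exception.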
directory
to
your project and maintaining your own grammar-based tokenizer.
Best
Erick
On Tue, Nov 30, 2010 at 12:06 AM, manjula wijewickrema
manjul...@gmail.com wrote:
Hi Steve,
Thanx a lot for your reply. Yes, there are only two classes and it's
correct
that the way you have realized
Hi,
In my work, I am using Lucene and two java classes. In the first one, I
index a document and in the second one, I try to search the most relevant
document for the indexed document in the first one. In the first java class,
I use the SnowballAnalyzer in the createIndex method and
analysis, rather than StandardAnalyzer.
Steve
-Original Message-
From: manjula wijewickrema [mailto:manjul...@gmail.com]
Sent: Monday, November 29, 2010 4:32 AM
To: java-user@lucene.apache.org
Subject: Analyzer
Hi,
In my work, I am using Lucene and two java classes
Hi,
Thanks a lot for your information.
Regards,
Manjula.
On Fri, Jul 23, 2010 at 12:48 PM, tarun sapra t.sapr...@gmail.com wrote:
You can use HibernateSearch to maintain the synchronization between the Lucene
index and the MySQL RDBMS.
On Fri, Jul 23, 2010 at 11:16 AM, manjula wijewickrema
Hi,
Normally, when I am building my index directory for indexed documents, I
used to keep my indexed files simply in a directory called 'filesToIndex'.
So in this case, I do not use any standard database management system such
as MySQL or any other.
1) Will it be possible to use MySQL or any
Hi Koji,
Thanks for your information
Manjula
On Fri, Jul 9, 2010 at 5:04 PM, Koji Sekiguchi k...@r.email.ne.jp wrote:
(10/07/09 19:30), manjula wijewickrema wrote:
Uwe, thanx for your comments. Following is the code I used in this case.
Could you pls. let me know where I have to insert
with any MaxfieldLength 5,000.
HTH
Erick
On Mon, Jul 12, 2010 at 4:00 AM, manjula wijewickrema
manjul...@gmail.com wrote:
Hi,
I have seen that, once the field length of a document goes over a
certain
limit (
http://lucene.apache.org/java/2_9_3/api/all/org/apache/lucene/index
Hi Rebecca,
Thanks for your valuable comments. Yes, I observed that, once the number of
terms of the document goes up, the fieldNorm value goes down correspondingly. I think,
therefore there won't be any default due to the variation of total number of
terms in the document. Am I right?
Manjula.
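The observation above matches the length normalization of Lucene's classic DefaultSimilarity as I understand it, where the norm for a field is 1/sqrt(number of terms). The sketch below only models that formula; the class name is hypothetical, field and document boosts are ignored, and the value Lucene actually stores is rounded into a single byte, so the numbers explain() prints are coarser than these.

```java
final class LengthNorm {
    /** Classic length normalization: longer fields get a smaller norm. */
    static double lengthNorm(int numTerms) {
        return 1.0 / Math.sqrt(numTerms);
    }
}
```

For example, a field with 4 terms gets 0.5 and a field with 100 terms gets 0.1, so the same term frequency contributes less in the longer document.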
On Thu, Jul 8,
Hi,
I ran a single programme to see the way of scoring by Lucene for a single
indexed document. The explain() method gave me the following results.
***
Searching for 'metaphysics'
Number of hits: 1
0.030706111
0.030706111 = (MATCH) fieldWeight(contents:metaphys in 0), product
removed stop words, so the norm is not what you expect?
-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
-Original Message-
From: manjula wijewickrema [mailto:manjul...@gmail.com]
Sent: Friday, July 09, 2010 9:21 AM
To: java
Thanx
On Fri, Jul 9, 2010 at 1:10 PM, Uwe Schindler u...@thetaphi.de wrote:
Thanks for your valuable comments. Yes, I observed that, once the number of
terms of the document goes up, the fieldNorm value goes down correspondingly. I think,
therefore there won't be any default due to the variation of total
like
System.out.println(indexSearcher.explain(query, 0));
See the javadocs for details.
--
Ian.
On Tue, Jul 6, 2010 at 7:39 AM, manjula wijewickrema
manjul...@gmail.com wrote:
Dear Grant,
Thanks a lot for your guidance. As you have mentioned, I tried to use
explain() method to get
();
Document document = hit.getDocument();
String path = document.get(FIELD_PATH);
System.out.println("Hit: " + path);
}
}
}
On Mon, Jul 5, 2010 at 7:46 PM, Grant Ingersoll gsing...@apache.org wrote:
On Jul 5, 2010, at 5:02 AM, manjula wijewickrema wrote:
Hi,
In my application, I
Hi,
In my application, I input only a single-term query (at one time) and get back
the corresponding scorings for those queries. But I am struggling a little to
understand Lucene scoring. I have referred to
http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/search/Similarity.html
and
some
.
On Fri, Jun 11, 2010 at 11:20 AM, manjula wijewickrema
manjul...@gmail.com wrote:
Hi,
Using the following programme I was able to get the entire file path of
indexed files which matched with the given queries. But my intention is
to
get only the file names even without .txt extension
Hi,
Using the following programme I was able to get the entire file path of
indexed files which matched with the given queries. But my intention is to
get only the file names even without .txt extension as I need to send these
file names as labels to another application. So, pls. let me know how
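One way to turn a stored path into a bare label, sketched with the standard java.io.File class only. The class name FileLabels and the sample path are hypothetical; the input would be the path string already retrieved from the hit's document.

```java
import java.io.File;

final class FileLabels {
    /** Strips the directory part and the last extension from a path. */
    static String baseName(String path) {
        String name = new File(path).getName(); // drops the directories
        int dot = name.lastIndexOf('.');
        // dot > 0 keeps names without an extension (and dot-files) unchanged
        return dot > 0 ? name.substring(0, dot) : name;
    }
}
```

Calling baseName("filesToIndex/report.txt") would give "report", which can then be passed on as the label.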
Dear Grant,
Thanks for your reply.
Manjula
On Mon, May 24, 2010 at 4:37 PM, Grant Ingersoll gsing...@apache.org wrote:
On May 20, 2010, at 5:15 AM, manjula wijewickrema wrote:
Hi,
I wrote a program to get the frequencies and terms of an indexed document.
The output comes as follows
and freqs are arrays. Try terms[i] and freqs[i].
--
Ian.
On Mon, May 17, 2010 at 12:23 PM, manjula wijewickrema
manjul...@gmail.com wrote:
Hi,
I wrote a code with a view to display the indexed terms and get their
term
frequencies of a single document. Although it displays
Hi,
I wrote a program to get the frequencies and terms of an indexed document.
The output comes as follows;
If I print : +tfv[0]
Output:
array terms are:{title: capabl/1, code/2, frequenc/1, lucen/4, over/1,
sampl/1, term/4, test/1}
In the same way I can print terms[i] and freqs[i], but the
the instructions here for getting the source:
http://wiki.apache.org/lucene-java/HowToContribute
HTH
Erick
On Sat, May 15, 2010 at 1:49 AM, manjula wijewickrema
manjul...@gmail.com wrote:
Hi,
I am struggling with using the HighFreqTerms class to find
high-frequency
terms in my index
Hi,
I wrote code with a view to display the indexed terms and get their term
frequencies of a single document. Although it displays those terms in the
index, it does not give the term frequencies. Instead it displays ' frequencies
are:[...@80fa6f '. What's the reason for this? The code I have
Dear Ian,
I changed it as you said and now it is working nicely. Thanks a lot for your
kind help.
Manjula
On Mon, May 17, 2010 at 6:46 PM, Ian Lea ian@gmail.com wrote:
terms and freqs are arrays. Try terms[i] and freqs[i].
--
Ian.
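A short sketch of what that advice amounts to: concatenating an int[] into a string prints Java's default type-and-hash form rather than the numbers, so the arrays have to be walked element by element (or printed with Arrays.toString). The class name and sample data are hypothetical.

```java
import java.util.Arrays;

final class PrintFreqs {
    /** Builds "term/freq" pairs by indexing both arrays in step. */
    static String describe(String[] terms, int[] freqs) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < terms.length; i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append(terms[i]).append('/').append(freqs[i]);
        }
        return sb.toString();
    }

    /** Readable form of the whole frequency array in one call. */
    static String all(int[] freqs) {
        return Arrays.toString(freqs);
    }
}
```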
On Mon, May 17, 2010 at 12:23 PM, manjula
() return? You don't appear to be doing anything
with the String term in for ( String term : vector.getTerms() ) -
presumably you intend to.
--
Ian.
On Thu, May 13, 2010 at 1:16 PM, manjula wijewickrema
manjul...@gmail.com wrote:
Dear Ian,
Thanks a lot for your immediate reply. As you
Hi,
Is it possible to put the indexed terms into an array in Lucene? For
example, imagine I have indexed a single document in Lucene and now I want
to access those terms in the index. Is it possible to retrieve (call) those
terms as array elements? If it is possible, then how?
Thanks,
Manjula
, Andrzej Bialecki a...@getopt.org wrote:
On 2010-05-14 11:35, manjula wijewickrema wrote:
Hi,
Is it possible to put the indexed terms into an array in Lucene? For
example, imagine I have indexed a single document in Lucene and now I
want
to access those terms in the index. Is it possible
class in my
code. But I was unable to find any guidance on how to do it. If you can, pls.
be kind enough to tell me how I can use this class in my code.
Thanx
Manjula
On Fri, May 14, 2010 at 6:16 PM, Andrzej Bialecki a...@getopt.org wrote:
On 2010-05-14 14:24, manjula wijewickrema wrote:
Hi
Hi,
I am struggling with using the HighFreqTerms class to find high-frequency
terms in my index. My target is to get the high frequency terms in an
indexed document (single document). To do that I have added
org.apache.lucene.misc package into my project. I think up to that point I am
Orange
-Original Message-
From: manjula wijewickrema manjul...@gmail.com
Date: Tue, 11 May 2010 15:13:12
To: java-user@lucene.apache.org
Subject: Re: Class_for_HighFrequencyTerms
Dear Erick,
I looked for it and even added IndexReader.java and TermFreqVector.java
from
http
Dear All,
I am trying to get the term frequencies (through TermFreqVector) of a
document (using Lucene 2.9.1). In order to do that I have used the following
code. But there is a compile-time error in the code and I can't figure it
out. Could somebody tell me what's wrong with it?
Compile
);
with
IndexReader ir = whatever(...);
TermFreqVector vector = ir.getTermFreqVector(0, fieldname );
And you'll need to move it to after the writer.close() call if you
want it to see the doc you've just added.
--
Ian.
On Thu, May 13, 2010 at 11:07 AM, manjula wijewickrema
manjul
at TermFreqVector?
Best
Erick
On Mon, May 10, 2010 at 8:10 AM, manjula wijewickrema
manjul...@gmail.com wrote:
Hi,
If I index a document (single document) in Lucene, then how can I get the
term frequencies (even the first and second highest occurring terms) of
that
document? Is there any
On Fri, May 7, 2010 at 2:22 PM, manjula wijewickrema
manjul...@gmail.com
wrote:
Hi,
I am using Lucene 2.9.1 . I have downloaded and run the
'HelloLucene.java'
class by modifying the input document and user query in various ways.
Once
I
put the document sentences as 'Lucene
Hi,
If I index a document (single document) in Lucene, then how can I get the
term frequencies (even the first and second highest occurring terms) of that
document? Is there any class/method to do that? If anybody knows, pls. help
me.
Thanks
Manjula
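A minimal sketch for picking the highest-frequency terms, assuming the terms and their counts have already been read from the document's term vector into two parallel arrays. The class name TopTerms is hypothetical and nothing here calls Lucene itself.

```java
import java.util.ArrayList;
import java.util.List;

final class TopTerms {
    /** Returns up to n terms ordered by descending frequency. */
    static List<String> top(String[] terms, int[] freqs, int n) {
        // Sort the positions, not the arrays, so terms and freqs stay aligned.
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < terms.length; i++) {
            order.add(i);
        }
        // List.sort is stable, so ties keep their original (term) order.
        order.sort((a, b) -> Integer.compare(freqs[b], freqs[a]));

        List<String> result = new ArrayList<>();
        for (int k = 0; k < Math.min(n, order.size()); k++) {
            result.add(terms[order.get(k)]);
        }
        return result;
    }
}
```

With n = 2 this gives the first and second most frequent terms of the document.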
Hi,
I am using Lucene 2.9.1. I have downloaded and run the 'HelloLucene.java'
class by modifying the input document and user query in various ways. Once I
put the document sentences as 'Lucene in actions' instead of 'Lucene in
action', and I gave the query as 'action' and ran the programme. But it
Hi,
I am new to Lucene. If I want to know the term or phrase frequency of an
input document, will it be possible through Lucene?
Thanks,
Manjula