Hi All,
is there any possibility to create a compression store for the
following type of string in a Lucene index store?
String str = "II0264.D05|00022745|ABCDE|03/01/2008 00:23:12|00035|
9840836588| 129382152520| 04F4243B600408|04F4243B600408|
|11919898456123|354943011025810L| "CPTBS2I"| "A
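Lucene 2.x does offer this for stored fields: Field.Store.COMPRESS compresses the stored value with DEFLATE (java.util.zip) under the hood. As a standalone sketch of that same round trip (class name and the abbreviated sample string are illustrative, not from Lucene):

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class CompressDemo {
    // Compress a byte array with DEFLATE, the algorithm Lucene 2.x
    // applies to fields stored with Field.Store.COMPRESS.
    public static byte[] compress(byte[] input) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        deflater.setInput(input);
        deflater.finish();
        byte[] buf = new byte[input.length * 2 + 64];
        int len = deflater.deflate(buf);
        deflater.end();
        byte[] out = new byte[len];
        System.arraycopy(buf, 0, out, 0, len);
        return out;
    }

    // Round-trip back to the original bytes.
    public static byte[] decompress(byte[] input, int originalLength) {
        try {
            Inflater inflater = new Inflater();
            inflater.setInput(input);
            byte[] out = new byte[originalLength];
            inflater.inflate(out);
            inflater.end();
            return out;
        } catch (DataFormatException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // Pipe-delimited records like the one above compress well because
        // of the repeated tokens and delimiters.
        String str = "II0264.D05|00022745|ABCDE|04F4243B600408|04F4243B600408";
        byte[] raw = str.getBytes(StandardCharsets.UTF_8);
        byte[] packed = compress(raw);
        byte[] restored = decompress(packed, raw.length);
        System.out.println(str.equals(new String(restored, StandardCharsets.UTF_8)));
        // prints: true
    }
}
```

At index time the equivalent would be `doc.add(new Field("record", str, Field.Store.COMPRESS, Field.Index.NO))`.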
On Tuesday 01 April 2008 18:51:55 Dominique Béjean wrote:
> IndexReader reader = IndexReader.open(temp_index);
> TermEnum terms = reader.terms();
>
> while (terms.next()) {
> String field = terms.term().field();
Gotcha: after calling terms() it's already pointing at the first term,
so a plain while (terms.next()) loop will skip that term.
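Given that gotcha, a corrected sketch of the loop above (assuming, per the note, that the enum is already positioned on the first term; the null check covers an empty index, and temp_index is the directory from the quoted snippet):

```java
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.TermEnum;

// ...
IndexReader reader = IndexReader.open(temp_index);
TermEnum terms = reader.terms();
try {
    if (terms.term() != null) {   // guard against an empty index
        do {
            Term t = terms.term();
            String field = t.field();
            // docFreq() = number of documents containing this term
            System.out.println(field + ":" + t.text() + " -> " + terms.docFreq());
        } while (terms.next());
    }
} finally {
    terms.close();
    reader.close();
}
```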
Sure; here are the two explanations (below). Your question made me go look
at the explanations more carefully again and, no surprise, I discovered that I
misspoke (miswrote) earlier; the two "found" terms are j2ee and soa,
which then makes my "concern" much less of one, since in both cases th
Donna L Gresh wrote:
I have two slightly different queries,
Hi Donna,
I can't help you, but perhaps I would understand everything better if you
also pasted in the explanations.
karl
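For reference, the explanations Karl is asking for come from Searcher.explain(); a minimal sketch using the Lucene 2.x Hits API (indexDir and query are assumed to exist already):

```java
import org.apache.lucene.search.Explanation;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.IndexSearcher;

// ...
IndexSearcher searcher = new IndexSearcher(indexDir);
Hits hits = searcher.search(query);
for (int i = 0; i < hits.length(); i++) {
    // Explanation breaks the score down into tf, idf, boosts, norms, etc.
    Explanation exp = searcher.explain(query, hits.id(i));
    System.out.println(exp.toString());
}
searcher.close();
```

Comparing the two queries' explanation trees side by side usually shows exactly which factor (idf, boost, norm) accounts for a small score difference.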
Wojtek H wrote:
> Snowball stemmers are part of Lucene, but for a few languages only. We
> have documents in various languages and so need stemmers for many
> languages (in particular Polish).
org.apache.lucene.analysis contains a few more stemmers.
Have you seen Stempel?
http://www.getopt.org/ste
I have two slightly different queries, and am filtering to return only a
single unique document. The scores are very slightly different, but in the
opposite way from what my (naive) reasoning would have expected.
In the first case the query is
text:"j2ee"^2.0, text:"soa"^2.0, text:webservic, tex
We use Lucene to create simple data stores that we deploy with our
application. Our application also supports auto-updating, and we refresh
these data stores monthly. Since Lucene computes the file names for the index,
we end up deploying new files each time while leaving the old files to continue
taking up space.
I just registered; this is an interesting website.
It seems crossfeeds generates its tag cloud offline, hourly? But I have a
stricter time requirement: users submit a query on my website, and they may get
tens of thousands of search results. I need to generate a tag cloud for all
these documents.
See Chris's reply, but for this <<>>
I think you want PerFieldAnalyzerWrapper.
Erick
On Mon, Mar 31, 2008 at 10:56 AM, Itamar Syn-Hershko <[EMAIL PROTECTED]>
wrote:
>
> Well, here is the thing - I don't necessarily want to get results per
> paragraphs - which your code will do just fine for. I
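A minimal sketch of Erick's PerFieldAnalyzerWrapper suggestion (the field names and analyzer choices below are illustrative, not from the thread):

```java
import org.apache.lucene.analysis.KeywordAnalyzer;
import org.apache.lucene.analysis.PerFieldAnalyzerWrapper;
import org.apache.lucene.analysis.WhitespaceAnalyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;

// ...
// Fall back to StandardAnalyzer for any field not registered below.
PerFieldAnalyzerWrapper wrapper =
    new PerFieldAnalyzerWrapper(new StandardAnalyzer());
wrapper.addAnalyzer("id", new KeywordAnalyzer());        // keep IDs as single tokens
wrapper.addAnalyzer("tags", new WhitespaceAnalyzer());   // split tags on whitespace only
IndexWriter writer = new IndexWriter(indexDir, wrapper, true);
```

Use the same wrapper at query time (e.g. pass it to QueryParser) so queries are tokenized consistently with the index.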
On www.crossfeeds.com, I use this method in order to update hourly a tag
cloud based on the title of 20.000 RSS articles (RSS published during the
last 24 hours). It takes 1 minute.
-Original Message-
From: wuqi [mailto:[EMAIL PROTECTED]
Sent: Tuesday, April 1, 2008 14:10
To: java-user@l
So build an index for the dynamically generated document set, and then try
to find the frequency of each term in this index... not sure it's fast enough,
but it's worth a try...
Thank you Dominique!
- Original Message -
From: "Dominique Béjean" <[EMAIL PROTECTED]>
To:
Sent: Tues
Hi all,
Snowball stemmers are part of Lucene, but for a few languages only. We
have documents in various languages and so need stemmers for many
languages (in particular Polish). One of the ideas is to use ispell
dictionaries. There are ispell dicts for many languages, and so this
solution is good fo
Maybe you can index the set of documents in a temporary index. This index
needs only one field (tag).
Then you can browse the term collection of the index and get each
term/frequency pair.
IndexReader reader = IndexReader.open(temp_index);
TermEnum terms = reader.terms();
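A fuller sketch of this suggestion, assuming tagTexts holds the tag strings of the documents to summarize (the term()/next() guard covers either initial positioning behavior of terms(); WhitespaceAnalyzer is an illustrative choice):

```java
import org.apache.lucene.analysis.WhitespaceAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.TermEnum;
import org.apache.lucene.store.RAMDirectory;

// ...
// 1. Index each document's tags into a throwaway in-memory index.
RAMDirectory temp_index = new RAMDirectory();
IndexWriter writer = new IndexWriter(temp_index, new WhitespaceAnalyzer(), true);
for (String tagText : tagTexts) {
    Document doc = new Document();
    doc.add(new Field("tag", tagText, Field.Store.NO, Field.Index.TOKENIZED));
    writer.addDocument(doc);
}
writer.close();

// 2. Walk the term dictionary; docFreq() gives each tag's frequency
//    across the result set, which sizes the tag cloud entries.
IndexReader reader = IndexReader.open(temp_index);
TermEnum terms = reader.terms();
if (terms.term() != null || terms.next()) {   // handle either initial position
    do {
        Term t = terms.term();
        if ("tag".equals(t.field())) {
            System.out.println(t.text() + " " + terms.docFreq());
        }
    } while (terms.next());
}
terms.close();
reader.close();
```

Since the index lives entirely in a RAMDirectory, nothing touches disk and the whole pass stays fast even for tens of thousands of result documents.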
OK, I opened LUCENE-1254 and committed the fix to trunk & (upcoming)
2.3.2.
Mike
Yonik Seeley wrote:
On Mon, Mar 31, 2008 at 5:19 AM, Michael McCandless
<[EMAIL PROTECTED]> wrote:
I think we should remove those checks and allow addIndexesNoOptimize
to import an index even if it has segm