I am trying to solve the following search problem. Say we have 10 different
documents, d1..d10. Each document contains one type of data, say d1 -> list
of movie names, d2 -> list of actor names, d3 -> list of addresses, etc. Each
document contains a list of entities and scores. So d1 contains movie names
My suggestion is not to worry about the docId. In practice it is an
"internal Lucene" id, quite similar to a rowId in a database. Each index
may generate a different docId (that is its own concern) for a translated
document; you can use your own ID that relates one document to another on
different
Hello.
Sorry to bring this up again. I don't want to be rude and I mean no
disrespect, but after thinking it through today,
I need and would really love to have the answer to the following
question:
1) At Lucene indexing time, is it possible to rewrite a read-only index
so that some field
You can do it.
Choose a reasonable algorithm.
You will also need to write your own Analyzer.
--
View this message in context:
http://lucene.472066.n3.nabble.com/Encryption-tp539373p4133687.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.
Created LUCENE-5633 for it.
On Tue, Apr 29, 2014 at 6:28 PM, Shai Erera wrote:
> NoMP means no merges, and indeed it seems silly that NoMP distinguishes
> between compound/non-compound settings. Perhaps it's rooted somewhere in
> the past, I don't remember.
>
> I checked and IndexWriter.addInde
NoMP means no merges, and indeed it seems silly that NoMP distinguishes
between compound/non-compound settings. Perhaps it's rooted somewhere in
the past, I don't remember.
I checked and IndexWriter.addIndexes consults
MP.useCompoundFile(segmentInfo) when it adds the segments. But maybe
NoMP.useCo
+1 to just have NoMergePolicy.INSTANCE
Mike McCandless
http://blog.mikemccandless.com
On Tue, Apr 29, 2014 at 8:07 AM, Robert Muir wrote:
> I think NoMergePolicy.NO_COMPOUND_FILES and
> NoMergePolicy.COMPOUND_FILES should be removed, and replaced with
> NoMergePolicy.INSTANCE
>
> If you want t
Thanks for the response. I was not aware of IWC.setUseCompoundFile.
@Shai this is what I feel is confusing - From what I understand
NoMergePolicy means no merges. Hence why have two separate options?
On Tue, Apr 29, 2014 at 5:44 PM, Shai Erera wrote:
> The problem is that compound files se
On Tue, Apr 29, 2014 at 8:14 AM, Shai Erera wrote:
>
> If we only offer NoMP.INSTANCE, what would it do w/ merged segments? always
> compound? always not-compound?
It doesn't merge, though.
The problem is that compound files settings are split between MergePolicy
and IndexWriterConfig. As documented on IWC.setUseCompoundFile, this
setting controls how new segments are flushed, while the MP setting
controls how merged segments are written.
If we only offer NoMP.INSTANCE, what would it
I think NoMergePolicy.NO_COMPOUND_FILES and
NoMergePolicy.COMPOUND_FILES should be removed, and replaced with
NoMergePolicy.INSTANCE
If you want to change whether CFS is used by indexwriter flush, you
need to set that in IndexWriterConfig.
On Tue, Apr 29, 2014 at 8:03 AM, Varun Thacker
wrote:
>
I wanted to use NoMergePolicy.NO_COMPOUND_FILES to ensure that no
merges take place on the index. However, I was unsuccessful. What am I
doing wrong here?
Attaching a gist with -
1. Output when using NoMergePolicy.NO_COMPOUND_FILES
2. Output when using TieredMergePolicy with policy.setNoC
Hi,
I am trying to retrieve Terms for a given set of documents (int array or
Bitset), which is the result of a query.
// Index creation
// Query with an IndexSearcher
IndexSearcher searcher = new IndexSearcher(ir);
TopDocs docs = searcher.search(query, 100);
From the "docs", an array of int c
On 04/29/2014 08:46 AM, Uwe Schindler wrote:
Hi Oliver,
To me it looks like you are making this much more complicated than it needs to
be. It also seems that you misunderstood join queries, which seems to be your
problem. Comments inside:
My lucene Index is built and stored in a zip file (uncompressed) which is use
Hi Rob,
While the demo code uses a fixed number of 3 values, you don't need to
encode the number of values up front. Since you read the byte[] of a
document up front, you can read in a while loop as long as in.position() <
in.length().
Shai
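
The while-loop idea above can be illustrated with a minimal sketch. It uses
java.nio.ByteBuffer from the JDK as a stand-in for Lucene's input classes, and
the class name VarCountDecoder and the fixed 8-byte long encoding are
assumptions made for illustration only, not the encoding used in the demo code.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: stores a variable number of values per document
// without writing the count first.
class VarCountDecoder {

    // Write the values back to back; no leading "number of values" field.
    static byte[] encode(long... values) {
        ByteBuffer buf = ByteBuffer.allocate(values.length * Long.BYTES);
        for (long v : values) {
            buf.putLong(v);
        }
        return buf.array();
    }

    // Read until the buffer is exhausted. The byte[] length alone tells us
    // when to stop, so zero values and ten values are handled the same way.
    // Assumes the array length is a multiple of 8; otherwise getLong()
    // throws BufferUnderflowException on the trailing bytes.
    static List<Long> decode(byte[] bytes) {
        ByteBuffer in = ByteBuffer.wrap(bytes);
        List<Long> values = new ArrayList<>();
        while (in.position() < in.limit()) {
            values.add(in.getLong());
        }
        return values;
    }
}
```

Calling VarCountDecoder.decode(VarCountDecoder.encode(7L, 8L, 9L)) gives back
the three values in order, and an empty byte[] decodes to an empty list. With
a variable-length encoding the loop condition stays the same; only the read
call inside the loop changes.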
On Tue, Apr 29, 2014 at 10:04 AM, Rob Audenaerde
wrot
This really helps! I didn't know about MultiReader. This looks like
exactly what I need for 1 & 2.
For 3, remapping docIds would allow me to use them as ids for my data,
instead of having a stored field with my ids (which is usually the
officially recommended way to do this in Lucene).
It may no
Hi Shai,
I read the article on your blog, thanks for it! It seems to be a natural fit to
do multi-values like this, and it is helpful indeed. For my specific problem, I
have multiple values that do not have a fixed number, so it can be either 0 or
10 values. I think the best way to solve this i