Can you create a scoring scenario that counts the number of fields in
which a term occurs and rank by that (descending) with some kind of
post-filtering?
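A minimal sketch of that idea in plain Java, outside Lucene: count the fields of each document that contain the term, keep only documents that reach a threshold, and rank by the count (descending). The class name FieldCountRanking, the map-of-token-lists document model, and the minFields parameter are assumptions made for illustration; none of them are Lucene API.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class FieldCountRanking {
    /** Counts the fields of one document whose token list contains the term. */
    static int fieldsContaining(Map<String, List<String>> doc, String term) {
        int count = 0;
        for (List<String> tokens : doc.values()) {
            if (tokens.contains(term)) {
                count++;
            }
        }
        return count;
    }

    /**
     * Returns document ids ordered by the number of matching fields (descending),
     * keeping only documents with at least minFields matching fields.
     * The sort is stable, so ties keep the iteration order of the input map.
     */
    static List<String> rank(Map<String, Map<String, List<String>>> docs,
                             String term, int minFields) {
        return docs.entrySet().stream()
                .map(e -> Map.entry(e.getKey(), fieldsContaining(e.getValue(), term)))
                .filter(e -> e.getValue() >= minFields)   // the filtering step
                .sorted(Map.Entry.<String, Integer>comparingByValue().reversed())
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }
}
```

Filtering before sorting gives the same result as post-filtering the ranked list and sorts fewer entries. With minFields set to 2, only documents that have the term in at least two different fields survive.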
On Fri, Apr 19, 2019 at 11:24 AM Valentin Popov wrote:
>
> Hi,
> I am trying to find a way to search for all docs that have the same term in
> different fields
That is not possible, because it would eliminate the flexibility in which
fields I search; I need to use the old data without reindexing.
Thanks.
On Sat, Apr 20, 2019 at 03:12, Tomoko Uchida wrote:
> Hi,
>
> I'm not sure there are better ways to meet your requirement by
> querying, but how about considering static approaches?
Hi,
I'm not sure there are better ways to meet your requirement by
querying, but how about considering static approaches?
I would index an auxiliary field which has binary values (0/1 or
"T"/"F") representing "has equals term on different fields"
so that you can filter out the docs (maybe by co
To explain better: I created my own class, MyStoredField.java, for saving a
value.
When reading, I used "instanceof" to tell whether it is a normal stored field
or not.
But in the debugger, the class produced when I read the document is StoredField,
not MyStoredField. Is there no way to do this?
2016-09-07 16:
I don't think that you need to be concerned with the internal docIDs much.
Just imagine the indexes as a big table with multiple columns, where
columns are grouped together. Each group is a different index. If a
document does not have a value in one column, then you have an empty cell.
if a documen
On 05/02/2014 06:05 AM, Shai Erera wrote:
If you're always rebuilding, let alone forceMerge, you shouldn't have too
much trouble implementing it. Just make sure that you add documents in the
same order to all indexes.
If you're always rebuilding, how come you have deletions? Anyway, you must
als
If you're always rebuilding, let alone forceMerge, you shouldn't have too
much trouble implementing it. Just make sure that you add documents in the
same order to all indexes.
If you're always rebuilding, how come you have deletions? Anyway, you must
also delete in all indexes.
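A toy model of the alignment requirement, in plain Java rather than Lucene: each list stands in for one index, and the list position plays the role of the docID. The class AlignedIndexes and its methods are hypothetical names for illustration only. Note one deliberate simplification: here a delete renumbers the later rows immediately, whereas Lucene only reassigns docIDs when segments are merged.

```java
import java.util.ArrayList;
import java.util.List;

class AlignedIndexes {
    // Each list stands in for one index; position i plays the role of docID i.
    private final List<String> titles = new ArrayList<>();
    private final List<String> bodies = new ArrayList<>();

    /** Adds the two halves of one document to both "indexes" in the same order. */
    void add(String title, String body) {
        titles.add(title);
        bodies.add(body);
    }

    /** Deletes the same position from both "indexes" so they stay aligned. */
    void delete(int docId) {
        titles.remove(docId);
        bodies.remove(docId);
    }

    /** Reads one logical row by combining the same position from both lists. */
    String row(int docId) {
        return titles.get(docId) + " | " + bodies.get(docId);
    }

    int size() {
        return titles.size();
    }
}
```

If a delete were applied to only one of the two lists, every later row would pair a title with the wrong body, which is the failure the advice above guards against.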
On May 2, 2014 1:57
On 05/01/2014 10:28 AM, Shai Erera wrote:
I'm glad it helped you. Good luck with the implementation.
Thanks. First I started looking at the Lucene internal code, to
understand when, where, and why docIds change or need to be changed (in
merges and doc deletions).
I have always wanted to unde
I'm glad it helped you. Good luck with the implementation.
One thing I didn't mention (though it's in the jdocs) -- it's not enough to
have the documents of each index aligned, you also have to have the
segments aligned. That is, if both indexes have documents 0-5 aligned, but
one index contains a
On 04/30/2014 10:48 AM, Shai Erera wrote:
I hope I got all the details right, if I didn't then please clarify. Also,
I haven't read the entire thread, so if someone already suggested this ...
well, it probably means it's the right solution :)
It sounds like you could use Lucene's ParallelComposi
I hope I got all the details right, if I didn't then please clarify. Also,
I haven't read the entire thread, so if someone already suggested this ...
well, it probably means it's the right solution :)
It sounds like you could use Lucene's ParallelCompositeReader, which
already handles multiple Ind
My suggestion is that you not worry about the docId. In practice it is an
"internal Lucene" id, quite similar to a rowId in a database. Each index
may generate a different docId for a translated document (that is its own
concern), so you may use your own ID that relates one document to another on
different
On 04/29/2014 08:46 AM, Uwe Schindler wrote:
Hi Oliver,
To me it looks like you are making this much too complicated. It also seems that
you misunderstood join queries, which seems to be your problem. Comments inside:
My lucene Index is built and stored in a zip file (uncompressed) which is use
This really helps! I didn't know about MultiReader. This looks like
exactly what I need for 1 & 2.
For 3, remapping docIds would allow me to use them as ids for my data,
instead of having a stored field with my ids (which is usually the
officially recommended way to do this in Lucene).
It may no
Hi Oliver,
To me it looks like you are making this much too complicated. It also seems that
you misunderstood join queries, which seems to be your problem. Comments inside:
> My lucene Index is built and stored in a zip file (uncompressed) which is used
> as a read-only Directory.
>
> 1) At lucen
>
>
> -Original Message-
> From: Ian Lea [mailto:ian@gmail.com]
> Sent: Thursday, February 17, 2011 5:52 PM
> To: java-user@lucene.apache.org
> Subject: Re: fields : stored and indexed
>
> http://lucene.apache.org/java/3_0_3/fileformats.html will tell you
Sent: Thursday, February 17, 2011 5:52 PM
To: java-user@lucene.apache.org
Subject: Re: fields : stored and indexed
http://lucene.apache.org/java/3_0_3/fileformats.html will tell you all
you need to know about what is stored where and how.
In general, the speed of searching i.e. finding matching
http://lucene.apache.org/java/3_0_3/fileformats.html will tell you all
you need to know about what is stored where and how.
In general, the speed of searching i.e. finding matching docs will not
be affected by the number of stored fields but retrieving data from
lots of stored fields will certainl
"if you do not have access to the original contents" is the key in Uwe's
comment. You do not need a separate field at all; it all depends upon
your situation. There's no problem in indexing AND storing a field.
HTH
Erick
On Sun, Aug 29, 2010 at 11:33 PM, Constantine Vetoshev
wrote:
> "Uwe Schind
"Uwe Schindler" writes:
> You cannot retrieve non-stored fields. They are analyzed and tokenized
> during indexing and this is a one-way transformation. If you update
> documents you have to reindex the contents. If you do not have access to the
> original contents anymore, you may consider adding
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
>
>
> > -Original Message-
> > From: Constantine Vetoshev [mailto:gepar...@gmail.com]
> > Sent: Sunday, August 29, 2010 10:38 PM
> > To: java-user@l
> -Original Message-
> From: Constantine Vetoshev [mailto:gepar...@gmail.com]
> Sent: Sunday, August 29, 2010 10:38 PM
> To: java-user@lucene.apache.org
> Subject: Re: Fields with Field.Store.NO and Field.Index.ANALYZED not being
> indexed
>
> Thanks Erick.
>
> I finally had tim
Thanks Erick.
I finally had time to go back and look at this problem. I discovered
that the analyzed fields work fine for searching until I use
IndexWriter.updateDocument().
The way my application runs, it has to update documents several times to
update one specific field. The update code queries
I would be extraordinarily surprised if this was in Lucene; this is so
basic to how it works that the howls would be heard world-round.
So I'm guessing it's in your code. Could you show it to us? Or, better
yet, create a small, self-contained test case that illustrates your problem?
Also, what a
[mailto:java-user-return-45558-paul.b.murdoch=saic@lucene.apache.org] On Behalf Of Erick Erickson
Sent: Wednesday, March 24, 2010 4:28 PM
To: java-user@lucene.apache.org
Subject: Re: Fields with the same name
I don't think so, but a quick way to check would
I don't think so, but a quick way to check would be to look at your
index with a copy of Luke and see what the actual tokens are.
But I'm not sure it matters, I don't think you *can* make things work
out well; your query-time analysis will be...er...difficult. You only
get to specify one analyzer
Thanks, Mike -- that makes sense. Yes, the fields would be known in advance
so the codec would know to ignore them at index time.
Thanks,
Chris
Lucene doesn't optimize for this today.
But with flex (still on branch but hopefully landing on trunk soon)
you could impl a codec that did optimize such fields. You would know,
in advance, which field(s) do this, right?
And at searching time the codec would pretend all docs appeared in the
post
I'll think about your ideas and see which one works best for my project. Thank
you.
> Date: Wed, 11 Feb 2009 18:11:58 -0700
> Subject: Re: Fields with multiple values...
> From: mark.a.fergu...@gmail.com
> To: java-user@lucene.apache.org
>
> One approach is to use dyn
One approach is to use dynamic fields, making the value of the second field
part of the name of the first field. So for example, you would have:
doc.Add (new Field ("Field1_A", "C", Field.Store.YES,
Field.Index.UN_TOKENIZED));
doc.Add (new Field ("Field1_B", "D", Field.Store.YES,
Field.Index.UN_TO
Dragon Fly wrote:
I'd like to get a hit if I do:
Field1:A AND Field2:C
This is fine because that's how Lucene works. However, I do not want to get a
hit if I do:
Field1:A AND Field2:D
The reason that I don't want a hit is because A is the first element in Field1
and D is the second el
Well, you could index with your index as part of the value...
doc.Add (new Field ("Field1", "1A", Field.Store.YES,
Field.Index.UN_TOKENIZED));
doc.Add (new Field ("Field1", "2B", Field.Store.YES,
Field.Index.UN_TOKENIZED));
// Add 2 values to Field2.
doc.Add (new Field ("Field2", "1C", Field.Store
> But you'd have to do result consolidation.
That's what I'm trying to avoid. I could get a lot of hits (e.g. 100,000 hits)
and will have to load all the documents to remove the duplicates.
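For reference, the consolidation step under discussion amounts to collapsing hits that share the same "DocID" value. A minimal plain-Java sketch follows; the Hit class and the consolidate method are hypothetical names, not Lucene API, and how the "DocID" value is obtained for each hit is deliberately left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class HitConsolidation {
    /** One search hit: the shared "DocID" of the original document plus a score. */
    static final class Hit {
        final String docId;
        final float score;

        Hit(String docId, float score) {
            this.docId = docId;
            this.score = score;
        }
    }

    /**
     * Keeps one hit per original document: the highest-scoring one.
     * Results are ordered by the first appearance of each DocID in the input.
     */
    static List<Hit> consolidate(List<Hit> hits) {
        Map<String, Hit> best = new LinkedHashMap<>();
        for (Hit hit : hits) {
            Hit previous = best.get(hit.docId);
            if (previous == null || hit.score > previous.score) {
                best.put(hit.docId, hit); // replacing a key keeps its original position
            }
        }
        return new ArrayList<>(best.values());
    }
}
```

This is a single pass over the hits, but it still needs the "DocID" value of every hit, so it does not by itself remove the cost described above.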
> Subject: RE: Fields with multiple values...
> Date: Wed, 11 Feb 2009 18
Hi Dragon Fly,
You could split the original document into multiple Lucene Documents,
one for each array index, all sharing the same "DocID" field value.
Then your queries "just work". But you'd have to do result
consolidation, removing duplicate original docs when you get matches at
multiple arr
Yes, assuming as I pointed out that your input string had
whitespace between "bar1" and "bar2" in your first example...
Erick
On Wed, Oct 15, 2008 at 2:32 PM, Rafael Almeida <[EMAIL PROTECTED]>wrote:
> On Wed, Oct 15, 2008 at 4:22 PM, Erick Erickson <[EMAIL PROTECTED]>
> wrote:
> > Yes, in terms
On Wed, Oct 15, 2008 at 4:22 PM, Erick Erickson <[EMAIL PROTECTED]> wrote:
> Yes, in terms of what you probably mean, but your first
> example would index one token "bar1bar2". But if you
> changed your first example to (note space): they would
> be entirely equivalent.
>
> doc.add(new Field("foo
Yes, in terms of what you probably mean, but your first
example would index one token "bar1bar2". But if you
changed your first example to (note space): they would
be entirely equivalent.
doc.add(new Field("foo",
"bar1 bar2",
Field.Store
On Tue, Aug 19, 2008 at 2:15 AM, Antony Bowesman <[EMAIL PROTECTED]> wrote:
>
> Thanks for your time, and I appreciate your valuable insight, Doron.
> Antony
>
I'm glad I could help!
Doron
Doron Cohen wrote:
The API definitely doesn't promise this.
AFAIK, implementation-wise it happens to be like this, but I could be wrong,
and it might change in the future. It would make me nervous to rely on
this.
I made some tests and it 'seems' to work, but I agree, it also makes me nervous
>
> payload and the other part for storing, i.e. something like this:
>>
>>Token token = new Token(...);
>>token.setPayload(...);
>>SingleTokenTokenStream ts = new SingleTokenTokenStream(token);
>>
>>Field f1 = new Field("f","some-stored-content",Store.YES,Index.NO);
>>Field f2
hen indexing a word to the default field from this:
Doc.Add(Field.Text("album", Album));
Cheers
Sachin
-Original Message-
From: Erick Erickson [mailto:[EMAIL PROTECTED]
Sent: 19 February 2007 16:05
To: java-user@lucene.apache.org
Subject: Re: Fields
See below.
On 2/19/07,
-Original Message-
From: Erick Erickson [mailto:[EMAIL PROTECTED]
Sent: 19 February 2007 16:05
To: java-user@lucene.apache.org
Subject: Re: Fields
See below.
On 2/19/07, Kainth, Sachin <[EMAIL PROTECTED]> wrote:
>
> Hi all,
>
> I have a few questions regarding indexing documents.
>
> 1. W
See below.
On 2/19/07, Kainth, Sachin <[EMAIL PROTECTED]> wrote:
Hi all,
I have a few questions regarding indexing documents.
1. With my experience of indexing documents with lucene so far I have
done things like:
Doc.Add(Field.Text("album", Album));
Where Album is a string representing an a
: I have a field called "location" on my index. For example, this string: "A
: B" "A C" D was stored on my index
: When I search for "location: ", these are the results that I'd like to
: retrieve:
: 1) location: D -- 1 hit
: 2) location: A -- no hits
: 3) location: "A B" -- 1 hit
: 4) location:
I know of no way of doing this with the standard analyzers, unless you do
some fooling around.
I think you'd have to write your own analyzer/tokenizer that you use both at
indexing time and query parsing time that broke the input streams up the way
you want. In this case, A B would be a SINGLE t
On Jun 14, 2005, at 12:46 PM, Peter A. Friend wrote:
From reading the docs, it appears that Lucene supports addition of
new fields over time to an existing index. This is very handy for
folks whose indexing requirements change over time. My question has
to do with a change in an existing f
Chuck Williams wrote:
Omar Didi writes (4/20/2005 5:05 PM):
Hi guys,
If a field is indexed as UnStored, how can I get its value?
I tried document.get("UnStored_field"), but it returns null.
You didn't store it, so it's not there. If the field happens to be a
single Term, you might be able to find it
Omar Didi writes (4/20/2005 5:05 PM):
Hi guys,
If a field is indexed as UnStored, how can I get its value?
I tried document.get("UnStored_field"), but it returns null.
You didn't store it, so it's not there. If the field happens to be a
single Term, you might be able to find it in the index, expensiv
Peter Veentjer - Anchor Men wrote:
I have a question about field boosting.
If I have 2 (or more) fields with the same fieldname in a single
document, and I boost one of those, then will only that one be boosted?
Or will all fields with the same name be boosted? I guess only one field
is boosted, bu
On Apr 15, 2005, at 14:44, Peter Veentjer - Anchor Men wrote:
I have a question about field boosting.
If I have 2 (or more) fields with the same fieldname in a single
document, and I boost one of those, then will only that one be boosted?
Or will all fields with the same name be boosted? I guess only