Hi,
The documentation says the text is treated as appended only "for the
purposes of search"; the values are not concatenated into a single term.
They will be indexed as two non-analyzed tokens with a default
PositionIncrementAttribute of 1. This would result in a document with
the tokens [ "ABC-12345", "12345" ], something a PhraseQuery for
"ABC-12345 12345" would match.
I believe you would need to write a custom analyzer and override
GetPositionIncrementGap to overcome this, but that route could just as
easily end up with an analyzer that parses your file numbers into the
previously mentioned tokens with a PositionIncrementAttribute value of 0
(for the second token) which would mean that they are considered to be
at the same position.
// Simon
On 2012-01-18 20:57, Brian Sayatovic wrote:
I considered that, but the documentation for Document.Add(...) scared me off:
"Adds a field to a document. Several fields may be added with the same name. In
this case, if the fields are indexed, their text is treated as though appended for the
purposes of search."
Would that mean that the FileNumber field of my document would become
"ABC-1234512345"?
Regards,
Brian.
Brian Sayatovic
Senior Software Architect
866 218 1003 toll-free ext. 8936
937-235-8936 office
4540 Honeywell Ct. Dayton, OH 45424
This message may contain confidential/proprietary information from the CINgroup
or its subsidiaries.
If you are not an intended recipient, please refrain from the disclosure,
copying, distribution or use of this information. All such unauthorized actions
are strictly prohibited. If you have received this transmission in error,
please notify the sender by e-mail and delete all copies of this material from
any computer.
-----Original Message-----
From: Simon Svensson [mailto:si...@devhost.se]
Sent: Wednesday, January 18, 2012 2:06 PM
To: lucene-net-user@lucene.apache.org
Subject: Re: [Lucene.Net] How best to leverage Lucene to index my field?
Importance: Low
Hi,
You could accomplish this by adding several FileNumber fields. I'm guessing
that a regexp would suffice to extract the number from the complete value.
var document = new Document();
document.Add(new Field("FileNumber", "ABC-12345", Field.Store.NO, Field.Index.NOT_ANALYZED));
document.Add(new Field("FileNumber", "12345", Field.Store.NO, Field.Index.NOT_ANALYZED));
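The extraction could look roughly like the sketch below. It assumes every file number is a group prefix without hyphens, one hyphen, then digits only; FileNumberParser and ExtractNumber are made-up names for illustration, not part of Lucene.Net.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FileNumberParser
{
    // Assumed format: "[group]-[number]", e.g. "ABC-12345".
    // The group is any run of non-hyphen characters; the number is digits only.
    private static readonly Regex Pattern = new Regex(@"^[^-]+-(\d+)$");

    // Returns the numeric suffix, or null when the value does not fit the assumed format.
    public static string ExtractNumber(string fileNumber)
    {
        Match match = Pattern.Match(fileNumber);
        return match.Success ? match.Groups[1].Value : null;
    }
}
```

When ExtractNumber returns a non-null value, that value would be what goes into the second FileNumber field; for anything that does not fit the assumed format, only the full value would be indexed.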
// Simon
On 2012-01-18 17:36, Brian Sayatovic wrote:
I have some data (files) that have prominent identifiers (file
numbers) that users often know the files by. File numbers are in the
form of "[group]-[number_within_region]". For example, "ABC-12345"
and "XYZ-12345". Today, I add a non-analyzed Field named "FileNumber"
with that full value.
However, while some users work across many groups, most users
search within a particular group. Those users are therefore bothered
by having to enter their group prefix when searching. XYZ users would
prefer to enter just "12345" instead of "XYZ-12345".
How can I make it so users can search by just the suffix (e.g.
"12345", to find both "ABC-12345" and "XYZ-12345"), or the full file
number? It seems the StandardAnalyzer doesn't break terms on hyphens.
Regards,
Brian.