Sent: Monday, March 3, 2014 12:09 PM
To: java-user@lucene.apache.org
Subject: encoding problem when retrieving document field value
Hi :)
My index (Lucene 3.5) contains a field called title. Its value is
indexed (analyzed and stored) with the WhitespaceAnalyzer and can
contain HTML entities such as
On Tue, Mar 4, 2014 at 4:44 AM, Jack Krupansky wrote:
> What is the hex value for that second character returned that appears to
> display as an apostrophe? Hex 92 (decimal 146) is listed as "Private Use
> 2", so who knows what it might display as.
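One way to answer the hex-value question is to dump the code points of the retrieved string. The sketch below is only an illustration: the class name CodePointDump and the helper dump are made up for this example, it assumes Java 7 or later (for StandardCharsets), and it assumes the JDK ships the windows-1252 charset. It also shows why hex 92 is ambiguous: read as ISO-8859-1 the byte becomes U+0092 (the "Private Use 2" control character), while read as windows-1252 it becomes U+2019, the right single quotation mark.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CodePointDump {

    // Hypothetical helper: formats each code point of s as U+XXXX,
    // separated by single spaces.
    static String dump(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] raw = { (byte) 0x92 };

        // Same byte, two decodings.
        String asLatin1 = new String(raw, StandardCharsets.ISO_8859_1);
        String asCp1252 = new String(raw, Charset.forName("windows-1252"));

        System.out.println(dump(asLatin1)); // U+0092
        System.out.println(dump(asCp1252)); // U+2019
    }
}
```

Calling dump on the value returned for the title field would show whether the stored character is U+0092 or U+2019, which in turn hints at which charset was used when the text was read before indexing.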
Well, if they're dealing with HTML, then it wil
come about picking a PU Unicode
character?
-- Jack Krupansky
p://www.thetaphi.de
eMail: u...@thetaphi.de
-Original Message-
From: G.Long [mailto:jde...@gmail.com]
Sent: Monday, March 03, 2014 6:09 PM
To: java-user@lucene.apache.org
Subject: encoding problem when retrieving document field value
> To: java-user@lucene.apache.org
> Subject: encoding problem when retrieving document field value
>
> Hi :)
>
> My index (Lucene 3.5) contains a field called title. Its value is indexed
> (analyzed and stored) with the WhitespaceAnalyzer and can contain HTML
> entities such as ’ or
Hi :)
My index (Lucene 3.5) contains a field called title. Its value is
indexed (analyzed and stored) with the WhitespaceAnalyzer and can
contain HTML entities such as ’ or °
My problem is that when I retrieve values from this field, some of the
HTML entities are missing.
For example:
Lu