Re: encoding problem when retrieving document field value

2014-03-04 Thread G.Long
nt: Monday, March 3, 2014 12:09 PM
To: java-user@lucene.apache.org
Subject: encoding problem when retrieving document field value
Hi :)
My index (Lucene 3.5) contains a field called title. Its value is indexed (analyzed and stored) with the WhitespaceAnalyzer and can contain HTML entities such as

Re: encoding problem when retrieving document field value

2014-03-03 Thread Trejkaz
On Tue, Mar 4, 2014 at 4:44 AM, Jack Krupansky wrote:
> What is the hex value for that second character returned that appears to
> display as an apostrophe? Hex 92 (decimal 146) is listed as "Private Use
> 2", so who knows what it might display as.
Well, if they're dealing with HTML, then it wil
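
The two readings of hex 92 discussed above can be compared directly. This is a minimal sketch, assuming the value originated as the single byte 0x92 in Windows-1252 text (the thread does not confirm this); the class and method names are illustrative only. As a Unicode code point, U+0092 is a C1 control character, while the byte 0x92 decoded as windows-1252 is U+2019, the right single quotation mark.

```java
import java.nio.charset.Charset;

public class Cp1252Demo {

    // Decode one byte as windows-1252 and return the resulting code point.
    static int decodeCp1252(int b) {
        String s = new String(new byte[] { (byte) b }, Charset.forName("windows-1252"));
        return s.codePointAt(0);
    }

    // True if the code point is a C1 control character (U+0080..U+009F).
    static boolean isC1Control(int cp) {
        return cp >= 0x80 && cp <= 0x9F && Character.getType(cp) == Character.CONTROL;
    }

    public static void main(String[] args) {
        // Reading 1: 0x92 taken as the Unicode code point U+0092.
        System.out.println("U+0092 is a C1 control: " + isC1Control(0x92));

        // Reading 2: 0x92 taken as a windows-1252 byte.
        System.out.printf("byte 0x92 in windows-1252 -> U+%04X%n", decodeCp1252(0x92));
    }
}
```

If the second reading is the intended one, the character should display as a typographic apostrophe rather than as an undefined control character.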

Re: encoding problem when retrieving document field value

2014-03-03 Thread Jack Krupansky
come about picking a PU Unicode character?
-- Jack Krupansky
-----Original Message-----
From: G.Long
Sent: Monday, March 3, 2014 12:09 PM
To: java-user@lucene.apache.org
Subject: encoding problem when retrieving document field value
Hi :)
My index (Lucene 3.5) contains a field called title. It

Re: encoding problem when retrieving document field value

2014-03-03 Thread G.Long
p://www.thetaphi.de
eMail: u...@thetaphi.de
-----Original Message-----
From: G.Long [mailto:jde...@gmail.com]
Sent: Monday, March 03, 2014 6:09 PM
To: java-user@lucene.apache.org
Subject: encoding problem when retrieving document field value
Hi :)
My index (Lucene 3.5) contains a field called title. It

RE: encoding problem when retrieving document field value

2014-03-03 Thread Uwe Schindler
M
> To: java-user@lucene.apache.org
> Subject: encoding problem when retrieving document field value
>
> Hi :)
>
> My index (Lucene 3.5) contains a field called title. Its value is indexed
> (analyzed and stored) with the WhitespaceAnalyzer and can contain HTML
> entities such as ’ or

encoding problem when retrieving document field value

2014-03-03 Thread G.Long
Hi :)
My index (Lucene 3.5) contains a field called title. Its value is indexed (analyzed and stored) with the WhitespaceAnalyzer and can contain HTML entities such as ’ or °
My problem is that when I retrieve values from this field, some of the HTML entities are missing. For example: Lu
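
One way to narrow down a problem like this is to print the code points of the retrieved string, so that characters a console or browser renders invisibly still show up. The following is a minimal sketch; the class name, the `dump` helper and the sample value are hypothetical and stand in for a stored field value read back from the index.

```java
public class CodePointDump {

    // Render every code point of the string as U+XXXX, separated by spaces.
    static String dump(String value) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < value.length(); ) {
            int cp = value.codePointAt(i);
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(String.format("U+%04X", cp));
            i += Character.charCount(cp); // advance past surrogate pairs correctly
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for a retrieved field value containing U+0092.
        String retrieved = "L\u0092a";
        System.out.println(dump(retrieved)); // prints "U+004C U+0092 U+0061"
    }
}
```

Comparing the dump of the value before indexing with the dump of the value after retrieval shows whether a character was actually dropped or is present but not displayed.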