RE: Unicode Character Problem

2016-12-12 Thread Allison, Timothy B.
ing%20Tika#PDF_Text_Problems -Original Message- From: Furkan KAMACI [mailto:furkankam...@gmail.com] Sent: Monday, December 12, 2016 10:55 AM To: solr-user@lucene.apache.org; Ahmet Arslan Subject: Re: Unicode Character Problem Hi Ahmet, I don't see any weird character when I manually copy it to

Re: Unicode Character Problem

2016-12-12 Thread Furkan KAMACI
Hi Ahmet,

I don't see any weird character when I manually copy it to any text editor.

On Sat, Dec 10, 2016 at 6:19 PM, Ahmet Arslan wrote:
> Hi Furkan,
>
> I am pretty sure this is a PDF extraction thing.
> Turkish characters caused us trouble in the past when extracting text
> from PDF files.

Re: Unicode Character Problem

2016-12-10 Thread Ahmet Arslan
Hi Furkan,

I am pretty sure this is a PDF extraction thing. Turkish characters caused us trouble in the past when extracting text from PDF files. You can confirm by manually copying and pasting from the original PDF file.

Ahmet

On Friday, December 9, 2016 8:44 PM, Furkan KAMACI wrote: Hi, I'm
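
The copy-paste check suggested above can be made a little more precise by comparing the pasted text with the indexed text character by character. This is only a minimal sketch: `first_difference` is a hypothetical helper, not part of Solr or Tika, and it assumes both strings are already available as Python `str` values. Both sides are normalized to NFC first so that a composed "ç" and a "c" followed by a combining cedilla are not reported as a difference.

```python
import unicodedata


def first_difference(pasted, indexed):
    """Return the index of the first differing character, or None if equal.

    Both inputs are normalized to NFC so that composed and decomposed
    forms of the same letter compare as equal.
    """
    a = unicodedata.normalize("NFC", pasted)
    b = unicodedata.normalize("NFC", indexed)
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    if len(a) != len(b):
        # One string is a prefix of the other; they differ where the shorter ends.
        return min(len(a), len(b))
    return None


# Hypothetical inputs: text pasted from the PDF vs. text read back from the index.
pasted = "a\u00e7\u0131klama"
indexed = "a\u00e7 \ufffdklama"
print(first_difference(pasted, indexed))
```

If the function returns `None` for text pasted from the PDF but an index for the extracted text, that would point at the extraction step rather than at the PDF content itself.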

Unicode Character Problem

2016-12-09 Thread Furkan KAMACI
Hi, I'm trying to index Turkish characters. This is what I see in my index (both forms appear in different places in my content): aç �klama açıklama These are the same word but indexed differently (the first one contains a weird character). I see that there is no weird character when I check the origi
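
To see exactly which character differs between the two indexed forms, it can help to print the Unicode code point and name of every character. The following is a minimal sketch using Python's standard `unicodedata` module; `describe` is a hypothetical helper name. It assumes the odd character is U+FFFD (the replacement character) preceded by a space, as it is rendered in this archive; the character actually stored in the index may be something else.

```python
import unicodedata


def describe(text):
    """Return (code point, Unicode name) for each character in text."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]


# The two forms from the post, written with escapes so the content is unambiguous.
# Assumption: the weird character is U+FFFD, as shown in this archive.
broken = "a\u00e7 \ufffdklama"
expected = "a\u00e7\u0131klama"

for label, word in (("broken", broken), ("expected", expected)):
    print(label)
    for code_point, name in describe(word):
        print(f"  {code_point}  {name}")
```

Running the same function on a value read back from the index would show whether the dotless ı (U+0131) was lost during extraction or replaced by another character.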