Thank you for bringing closure.

Mike McCandless
http://blog.mikemccandless.com

On Wed, Sep 28, 2016 at 11:56 AM, Jérémy GUYENOT <jguye...@efalia.com> wrote:

> Hi Michael,
>
> I just found my problem. Because Lucene was indexing "abcd." as a single
> word, we had added a regex to our code that inserts a space between "abcd"
> and "." (or any other punctuation character).
>
> So I updated this regex and it works fine.
>
> The code before:
>
> // Add a space between a word and a punctuation character
> String pattern = "(\\w)([\\.,;\\?!:])";
> contents = contents.replaceAll(pattern, "$1 $2");
>
> The code after:
>
> // Leave figures alone so that amounts are not cut in two
> // REGEX: a word character (\w) followed by one of . , ; ? ! :
> // unless that punctuation is followed (after optional whitespace) by a digit
> String pattern = "(\\w)([\\.,;\\?!:])(?!(\\s*[0-9]))";
> contents = contents.replaceAll(pattern, "$1 $2");
>
> Thanks a lot for your time.
>
> Goodbye
>
> Jérémy GUYENOT | Head of R&D
> jguye...@efalia.com
> 49, av. de la République 69200 Vénissieux
> Tel: 04 72 51 77 55 | Fax: 04 72 50 43 13
> WWW.EFALIA.COM <http://www.efalia.com/>
>
> For technical follow-up on your requests, please use Mantis
> <http://feqa.communauteged-multigest.fr/>, our online tool.
>
> Eco-responsibility: print this email only if necessary.
>
> From: Michael McCandless [mailto:luc...@mikemccandless.com]
> Sent: Tuesday, September 27, 2016 16:19
> To: Lucene Users <java-user@lucene.apache.org>; Jérémy GUYENOT
> <jguye...@efalia.com>
> Cc: Jan Høydahl <jan....@cominvent.com>
> Subject: Re: Research problems on numeric values into text (with. or,)
>
> Possibly you are using an analyzer that does not preserve decimal numbers
> as a single token? Or, you are using a different analyzer at indexing time
> vs search time?
>
> Can you make a small test case showing the issue?
>
> Mike McCandless
> http://blog.mikemccandless.com
>
> On Tue, Sep 27, 2016 at 3:06 PM, Jérémy GUYENOT <jguye...@efalia.com>
> wrote:
>
> Hello,
>
> Sorry for the repeated post, but my first one got no answers, so I am
> trying another way.
>
> What are you indexing?
>
> I want to index files like the one in the "ZIP \ file" folder, which
> contains decimal data (with . or , as the decimal separator).
>
> How are you searching, and what did you expect to find?
>
> I want to be able to search for decimals because our tools store large
> quantities of such documents (e.g. invoices, quotes, orders).
>
> What do you actually see and why is that a problem?
>
> The search for the number 404 returns files containing 404.
> The search for the number 50 returns files containing 50.
> The search for the number 404.50 returns no results.
>
> The text content was stored in a TextField with Field.Store.NO.
>
> I tried several Analyzers but the result is the same. I also tried Lucene
> 4.3.1 and 6.2.0, with the same result.
>
> I hope you can give me some details on how to search for decimal values in
> text files.
>
> In the zip you can find:
>
> - File
>   o The example file containing decimal values
> - Index
>   o The Lucene index files
> - Indexationlucene
>   o The code we use to index files from our app
> - RechercheLucene
>   o The code we use to search in our app
>
> Regards,
>
> Jérémy GUYENOT | Head of R&D
>
> From: Jan Høydahl [mailto:jan....@cominvent.com]
> Sent: Tuesday, September 27, 2016 10:20
> To: java-user@lucene.apache.org
> Cc: Jérémy GUYENOT <jguye...@efalia.com>
> Subject: Re: Research problems on numeric values into text (with. or,)
>
> Please do not cross-post to multiple mailing lists.
> This belongs to java-user only.
>
> It is also generally better to describe the problem in more detail in the
> mail than to attach a zip.
>
> - What are you indexing
> - How are you searching, and what did you expect to find
> - What do you actually see and why is that a problem?
>
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
>
> On 27 Sep 2016, at 10:15, Jérémy GUYENOT <jguye...@efalia.com> wrote:
>
> Hello,
>
> We are having trouble searching for numeric values in text (with . or ,).
> We are unable to search for 315.86 or 315,86.
>
> We also tried custom Analyzers, without success.
>
> I enclose the code used for indexing and the code used for searching.
>
> I do not know whether this is a bug on your side or a problem with our
> Analyzer.
>
> The problem is the same in versions 4.3.1 and 6.2.0.
>
> Thank you in advance for your quick reply.
>
> Regards,
>
> <LUCENE.zip>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
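
For anyone finding this thread later: the effect of the two patterns quoted in the thread can be checked outside Lucene with a small sketch like the one below. Only the two pattern strings and the "$1 $2" replacement come from Jérémy's message. The class name, the method names and the sample line are made up for illustration and are not part of his code.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, written only to compare the two patterns from the thread.
final class PunctuationSpacing {

    // Pattern "before": a word character followed by one of . , ; ? ! :
    static final Pattern ORIGINAL = Pattern.compile("(\\w)([\\.,;\\?!:])");

    // Pattern "after": same match, but rejected when the punctuation is
    // followed (after optional whitespace) by a digit.
    static final Pattern REVISED = Pattern.compile("(\\w)([\\.,;\\?!:])(?!(\\s*[0-9]))");

    // Inserts a space between the word character and the punctuation.
    static String withOriginalPattern(String contents) {
        return ORIGINAL.matcher(contents).replaceAll("$1 $2");
    }

    static String withRevisedPattern(String contents) {
        return REVISED.matcher(contents).replaceAll("$1 $2");
    }

    public static void main(String[] args) {
        // Sample text is an assumption, loosely modelled on the invoices mentioned in the thread.
        String sample = "Invoice abcd. Amount 404.50 or 315,86.";
        System.out.println(withOriginalPattern(sample));
        System.out.println(withRevisedPattern(sample));
    }
}
```

With this sample line, the original pattern gives "Invoice abcd . Amount 404 .50 or 315 ,86 ." and so cuts each amount in two, which matches the reported symptom (404 and 50 found, 404.50 not found). The revised pattern gives "Invoice abcd . Amount 404.50 or 315,86 ." and leaves the amounts whole. One side effect of the `\s*` in the lookahead: punctuation followed by whitespace and then a digit, as in "Total: 404", is also left untouched.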