Shouldn't ls_text be TCHAR?

(I'm asking other people reading this thread)
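
To make the question concrete: if this CLucene build uses a wide TCHAR (an assumption on my part, I have not seen the build settings), the text has to reach Field as wide characters, not as raw ANSI or UTF-8 bytes. Below is a minimal sketch in standard C++ of decoding UTF-8 bytes into a std::wstring. utf8_to_wide is a hypothetical helper, not part of CLucene or MFC, and it only covers 1- to 3-byte sequences.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not a CLucene or MFC API): decode UTF-8 bytes into
// wide characters. Covers 1- to 3-byte sequences only; anything else,
// including malformed or truncated input, becomes U+FFFD. Overlong forms
// are not rejected.
std::wstring utf8_to_wide(const std::string& in) {
    const wchar_t replacement = static_cast<wchar_t>(0xFFFD);
    std::wstring out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        unsigned long cp = 0;
        std::size_t extra = 0;  // continuation bytes expected after the lead byte
        if (lead < 0x80)                { cp = lead; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else { out.push_back(replacement); ++i; continue; }

        // Every expected continuation byte must exist and look like 10xxxxxx.
        bool ok = (i + extra < in.size());
        for (std::size_t k = 1; ok && k <= extra; ++k) {
            const unsigned char b = static_cast<unsigned char>(in[i + k]);
            if ((b & 0xC0) != 0x80) ok = false;
            else cp = (cp << 6) | (b & 0x3F);
        }
        if (!ok) { out.push_back(replacement); ++i; continue; }

        out.push_back(static_cast<wchar_t>(cp));
        i += extra + 1;
    }
    return out;
}
```

With this, the UTF-8 bytes for "orçamento" decode to nine wide characters with U+00E7 in the third position, while a lone ANSI 0xE7 byte is not valid UTF-8 and comes out as U+FFFD. On Windows, MultiByteToWideChar does the same job and also handles ANSI code pages.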

On Mon, Apr 26, 2010 at 9:58 AM, Rui Oliveira <ruifra...@hotmail.com> wrote:

>  void c_IndexEx::m_Add(CString avs_codRevsId)
>
> {
>
>        CString ls_origem = "c_IndexEx::m_Add";
>
>
>
>        try
>
>        {
>
>               m_InitVariables();
>
>
>
>               if(!ii_enmIndx)
>
>                      return;
>
>
>
>               IndexWriter* writer = NULL;
>
>               lucene::analysis::standard::StandardAnalyzer an;
>
>
>
>               if ( IndexReader::indexExists(iclp_indexPath) ){
>
>                      if ( IndexReader::isLocked(iclp_indexPath) )
>
>                      {
>
>                            m_AppendLog("Index was locked... unlocking it.");
>
>
>
>                            IndexReader::unlock(iclp_indexPath);
>
>                      }
>
>
>
>                      writer = _CLNEW IndexWriter( iclp_indexPath, &an, false);
>
>               }
>
>               else
>
>               {
>
>                      writer = _CLNEW IndexWriter( iclp_indexPath, &an, true);
>
>               }
>
>
>               writer->setMaxFieldLength(IndexWriter::DEFAULT_MAX_FIELD_LENGTH);
>
>               writer->setUseCompoundFile(true);
>
>
>
>               uint64_t str = lucene::util::Misc::currentTimeMillis();
>
>
>
>               // make a new, empty document
>
>               Document* lcl_doc = _CLNEW Document();
>
>               if(m_FileDocument( avs_codRevsId, lcl_doc ))
>
>               {
>
>                      writer->addDocument( lcl_doc );
>
>               }
>
>               _CLDELETE(lcl_doc);
>
>
>
>               writer->optimize();
>
>               writer->close();
>
>               _CLDELETE(writer);
>
>        }
>
>        catch(CLuceneError& err)
>
>        {
>
>        //     e->Delete();
>
>               return;
>
>        }
>
>        catch( CException* e )
>
>        {
>
>        //     e->Delete();
>
>               m_AppendLog(ls_origem);
>
>               return;
>
>        }
>
>        catch(...)
>
>        {
>
>        //     e->Delete();
>
>               return;
>
>        }
>
> }
>
>
>
> BOOL c_IndexEx::m_FileDocument(CString avs_codRevsId, Document* arcl_doc)
>
> {
>
>        // make a new, empty document
>
>        CString ls_codDocmId;
>
>        CString ls_Path = m_GetFilePath(avs_codRevsId, &ls_codDocmId);
>
>        if(ls_Path.IsEmpty())
>
>        {
>
>               return FALSE;
>
>        }
>
>        char* lcl_Path = NULL;
>
>        lcl_Path = new char[ls_Path.GetLength()+1];
>
>        _tcscpy(lcl_Path, ls_Path);
>
>
>
>        CString ls_text;
>
>        m_GetFileContents(lcl_Path, &ls_text);
>
>        arcl_doc->add( *_CLNEW Field(_T("contents"), ls_text, Field::STORE_YES | Field::INDEX_TOKENIZED) );
>
>
>
>        icl_file.m_DeleteFile(ls_Path);
>
>
>
>        // return the document
>
>        delete lcl_Path;
>
>        return TRUE;
>
> }
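
One thing in m_FileDocument, independent of the encoding question: lcl_Path is a char buffer that is filled with _tcscpy and released with delete. In a Unicode build _tcscpy maps to wcscpy, so a char buffer would not match it (I am assuming a Unicode build could be in play here), and memory obtained with new[] has to be released with delete[]. A minimal sketch in standard C++ of a buffer that avoids both problems; to_owned_buffer is a hypothetical name, not something from the code above:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: copy a string into an owned, NUL-terminated buffer
// whose element type always matches the string's character type.
template <typename CharT>
std::vector<CharT> to_owned_buffer(const std::basic_string<CharT>& s) {
    std::vector<CharT> buf(s.begin(), s.end());
    buf.push_back(CharT());  // terminating NUL
    return buf;              // released automatically, no delete / delete[] to get wrong
}
```

buf.data() can then be passed wherever a NUL-terminated pointer is expected, and the same code works for char and wchar_t strings.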
>
>
>
>
> ------------------------------
> From: oniltonmac...@gmail.com
> Date: Mon, 26 Apr 2010 10:36:45 -0300
>
> To: clucene-developers@lists.sourceforge.net
> Subject: Re: [CLucene-dev] Clucene search - Do not found some words
>
> Can you send the code where you index?
>
> On Mon, Apr 26, 2010 at 9:55 AM, Rui Oliveira <ruifra...@hotmail.com> wrote:
>
> How can I check this?
>
> I just read the text from the files into a CString, and after that pass it to
> CLucene.
>
> Apparently the text I read from the file into the CString is right; I have
> checked it in debug mode and it looks good.
>
> Rui
>
>
>
> > Date: Mon, 26 Apr 2010 14:44:56 +0200
> > From: nuncupa...@googlemail.com
>
> > To: clucene-developers@lists.sourceforge.net
> > Subject: Re: [CLucene-dev] Clucene search - Do not found some words
> >
> > Rui,
> >
> > which encoding do you use internally before you give it to CLucene?
> > Maybe you use an encoding different to the encoding expected by
> > CLucene.
> >
> > Kind regards,
> >
> > Veit
> >
> > 2010/4/26 Rui Oliveira <ruifra...@hotmail.com>:
> > > Hi,
> > >
> > > I have been using luke to analyze the index.
> > >
> > > Well, all Portuguese characters appear replaced by a strange character.
> > >
> > > What can I do to avoid this?
> > > Is it not possible to make clucene work with Portuguese characters?
> > >
> > > Thanks & Regards,
> > > Rui
> > >
> > >
> > >
> > >> Date: Fri, 23 Apr 2010 20:43:49 +0200
> > >> From: bvanklin...@gmail.com
> > >> To: clucene-developers@lists.sourceforge.net
> > >> Subject: Re: [CLucene-dev] Clucene search - Do not found some words
> > >>
> > >> I suggest using a program called luke (google it). You can then look
> > >> into the index and see what is indexed. Let us know if u see all the
> > >> words you would expect to see. And see if u can find the document if u
> > >> search from luke
> > >>
> > >> handy program :)
> > >>
> > >> cheers
> > >> ben
> > >>
> > >> On Friday, April 23, 2010, Rui Oliveira <ruifra...@hotmail.com> wrote:
> > >> >
> > >> >
> > >> >
> > >> >
> > >> >
> > >> > Itamar,
> > >> >
> > >> > All the tests were run against the same file. That one file contains
> > >> > both "orçamento" and "administração"; "administração" is found and
> > >> > "orçamento" is not.
> > >> >
> > >> > The results are the same whether the file is ANSI, Unicode or UTF8
> > >> > encoded. The problem is not in loading the files, because I checked
> > >> > the loaded text in the debugger and it is OK.
> > >> >
> > >> > Rui
> > >> >
> > >> >
> > >> >
> > >> >
> > >> > From: ita...@divrei-tora.com
> > >> > To: clucene-developers@lists.sourceforge.net
> > >> > Date: Fri, 23 Apr 2010 17:59:27 +0300
> > >> > Subject: Re: [CLucene-dev] Clucene search - Do not found some words
> > >> >
> > >> > Rui,
> > >> >
> > >> > This file is ANSI encoded. Are the other files, the ones you do succeed
> > >> > in finding, Unicode / UTF8 encoded perhaps? If that's the case, your
> > >> > routine for loading the files is buggy. You should either have them all
> > >> > encoded using the same encoding, or have more intelligent code to
> > >> > convert incompatible encodings.
> > >> >
> > >> > HTH
> > >> >
> > >> > Itamar.
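
A rough sketch of what that "more intelligent code" could start with: look at the byte order mark before deciding how to decode the file. This is standard C++, the names TextEncoding and detect_bom are hypothetical, and it is only a first step. A file without a BOM may be ANSI or BOM-less UTF-8, and this check cannot tell those apart. It also ignores UTF-32.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical names, for illustration only.
enum class TextEncoding { Utf8Bom, Utf16LE, Utf16BE, Unknown };

// Classify raw file bytes by their byte order mark, if there is one.
TextEncoding detect_bom(const std::string& bytes) {
    auto at = [&bytes](std::size_t i) {
        return static_cast<unsigned char>(bytes[i]);
    };
    if (bytes.size() >= 3 && at(0) == 0xEF && at(1) == 0xBB && at(2) == 0xBF)
        return TextEncoding::Utf8Bom;
    if (bytes.size() >= 2 && at(0) == 0xFF && at(1) == 0xFE)
        return TextEncoding::Utf16LE;
    if (bytes.size() >= 2 && at(0) == 0xFE && at(1) == 0xFF)
        return TextEncoding::Utf16BE;
    return TextEncoding::Unknown;  // no BOM: could be ANSI or BOM-less UTF-8
}
```

The loader would then pick a conversion per result and treat Unknown as the configured default code page, so that every file reaches the indexer in one encoding.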
> > >> >
> > >> >
> > >> > From: Rui Oliveira [mailto:ruifra...@hotmail.com]
> > >> > Sent: Friday, April 23, 2010 4:32 PM
> > >> > To: clucene-developers; oniltonmac...@gmail.com
> > >> > Subject: Re: [CLucene-dev] Clucene search - Do not found some words
> > >> >
> > >> >
> > >> > I just attach the file.
> > >> >
> > >> > Tks, Rui
> > >> >
> > >> >
> > >> > From: oniltonmac...@gmail.com
> > >> > Date: Fri, 23 Apr 2010 09:22:05 -0400
> > >> > To: clucene-developers@lists.sourceforge.net
> > >> > Subject: Re: [CLucene-dev] Clucene search - Do not found some words
> > >> >
> > >> > Can you send me the file that has both "orçamento" and "administração"?
> > >> >
> > >> > Or you can do a test: open the file and delete the ç from orçamento and
> > >> > administração.
> > >> > And then type ç again.
> > >> >
> > >> > Index again and try to search both words again.
> > >> >
> > >> > On Fri, Apr 23, 2010 at 9:14 AM, Rui Oliveira <ruifra...@hotmail.com> wrote:
> > >> >
> > >> > They are text files (*.txt) and both words are in the same document.
> > >> > When I search for "orçamento" nothing is found, and when I search for
> > >> > "administração" the document is found.
> > >> >
> > >> >
> > >> > Rui
> > >> >
> > >> >
> > >> > From: oniltonmac...@gmail.com
> > >> > Date: Fri, 23 Apr 2010 09:09:30 -0400
> > >> >
> > >> >
> > >> >
> > >> > To: clucene-developers@lists.sourceforge.net
> > >> > Subject: Re: [CLucene-dev] Clucene search - Do not found some words
> > >> >
> > >> > Seems like an encoding problem with these documents. Are they html
> > >> > pages?
> > >> > Are the words "orçamento" and "administração" in the same page, for
> > >> > example?
> > >> >
> > >> > Can you dump one of these files here? (One that has the problem and one
> > >> > that has not)
> > >> >
> > >> >
> > >> > On Fri, Apr 23, 2010 at 9:05 AM, Rui Oliveira <ruifra...@hotmail.com> wrote:
> > >> >
> > >> > I am indexing several separate documents.
> > >> >
> > >> > The document that has these words is a small text document. It is
> > >> > indexed without any visible error, and the same document is found
> > >> > when I search for other words in it.
> > >> >
> > >> >
> > >> > Rui
> > >> >
> > >> >
> > >> > From: oniltonmac...@gmail.com
> > >> > Date: Fri, 23 Apr 2010 08:58:05 -0400
> > >> >
> > >> >
> > >> >
> > >> > To: clucene-developers@lists.sourceforge.net
> > >> > Subject: Re: [CLucene-dev] Clucene search - Do not found some words
> > >> >
> > >> > What are you indexing?
> > >> >
> > >> > Just a big document?
> > >> > Or a lot of separate documents? (html documents?)
> > >> >
> > >> > On Fri, Apr 23, 2010 at 8:54 AM, Rui Oliveira <ruifra...@hotmail.com> wrote:
> > >> >
> > >> > Hi Onilton,
> > >> >
> > >> > I have tested with "orcamento" instead of "orçamento" and didn't get
> > >> > anything.
> > >> >
> > >> > I do not know if lucene indexes "orçamento" in a wrong way: it
> > >> > indexes without any error, but when I search for the word I do not get
> > >> > anything.
> > >> >
> > >> > Thanks & Regards,
> > >> > Rui
> > >> >
> > >> >
> > >> > From:
> > >> >
> > >>
> > >>
> > >>
> ------------------------------------------------------------------------------
> > >> _______________________________________________
> > >> CLucene-developers mailing list
> > >> CLucene-developers@lists.sourceforge.net
> > >> https://lists.sourceforge.net/lists/listinfo/clucene-developers