Hi Shai,

Thank you very much. I got indexing up and running with Solr.

But actually, I expected to be able to import Lucene as a library (I am not a
Java expert; I am more familiar with C/C++) and call Lucene functions directly.

Could you give me the URL of a Lucene 4 tutorial that is suitable for a Java
newbie?



----------------------------------
Thanks and Best Regards

Vinh Dang (Msc.)
Project Manager
FPT Software
Mobile: +84 982 058 956
Skype:  dqvinh87
Y!M:    dqvinh87
Email: dqvin...@gmail.com
Websites: http://www.vinhdq.blogspot.com


On Tue, Jul 9, 2013 at 7:26 AM, Shai Erera <ser...@gmail.com> wrote:

> Well ... at a high level, this is what you should do:
>
>
>    1. Integrate with Apache Tika for parsing the .DOC files (and maybe
>    other office files you have)
>    2. Tika extracts the contents of the document, as well as some metadata
>    3. Create a Lucene Document object to which you add Fields:
>       1. TextField for e.g. the "content" field
>       2. StringField for e.g. the path to the document on the file system
>       3. NumericDocValuesField for e.g. the document's modification date
>       4. Perhaps another StringField for the document's type (Word,
>       PowerPoint)
>    4. Index these documents with IndexWriter
>    5. Search using IndexSearcher
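
The five steps above can be sketched as one small Java class. This is only a
rough illustration, not a definitive implementation. It assumes the Lucene
4.3.x jars (lucene-core, lucene-analyzers-common, lucene-queryparser) and
Apache Tika are on the classpath. The class name DocFolderIndexer and the
field names "content", "path" and "modified" are made up for this example.
StandardAnalyzer is also an assumption: it splits Vietnamese text on
whitespace and punctuation but does no Vietnamese-specific word segmentation.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.NumericDocValuesField;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.queryparser.classic.ParseException;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;
import org.apache.tika.Tika;
import org.apache.tika.exception.TikaException;

// Hypothetical example class; targets the Lucene 4.3.x API.
public class DocFolderIndexer {
    private static final Version LUCENE = Version.LUCENE_43;

    // Steps 3-4: build one Lucene Document and add it to the index.
    static void addDocument(IndexWriter writer, String path, String content,
                            long lastModified) throws IOException {
        Document doc = new Document();
        doc.add(new TextField("content", content, Field.Store.NO));   // tokenized, searchable
        doc.add(new StringField("path", path, Field.Store.YES));      // stored as a single token
        doc.add(new NumericDocValuesField("modified", lastModified)); // per-document numeric value
        writer.addDocument(doc);
    }

    // Steps 1-4: extract text from every .doc file in the folder with Tika and index it.
    static void indexFolder(File folder, Directory indexDir)
            throws IOException, TikaException {
        Tika tika = new Tika();
        IndexWriter writer = new IndexWriter(indexDir,
                new IndexWriterConfig(LUCENE, new StandardAnalyzer(LUCENE)));
        try {
            File[] files = folder.listFiles();
            if (files != null) {
                for (File f : files) {
                    if (f.isFile() && f.getName().toLowerCase().endsWith(".doc")) {
                        addDocument(writer, f.getPath(), tika.parseToString(f),
                                f.lastModified());
                    }
                }
            }
        } finally {
            writer.close(); // commits pending changes and releases the index lock
        }
    }

    // Step 5: search the "content" field and return the stored paths of the hits.
    static List<String> search(Directory indexDir, String queryText, int maxHits)
            throws IOException, ParseException {
        DirectoryReader reader = DirectoryReader.open(indexDir);
        try {
            IndexSearcher searcher = new IndexSearcher(reader);
            Query query = new QueryParser(LUCENE, "content",
                    new StandardAnalyzer(LUCENE)).parse(queryText);
            List<String> paths = new ArrayList<String>();
            for (ScoreDoc hit : searcher.search(query, maxHits).scoreDocs) {
                paths.add(searcher.doc(hit.doc).get("path"));
            }
            return paths;
        } finally {
            reader.close();
        }
    }

    // Usage: java DocFolderIndexer <docFolder> <indexFolder> <query>
    public static void main(String[] args) throws Exception {
        Directory indexDir = FSDirectory.open(new File(args[1]));
        indexFolder(new File(args[0]), indexDir);
        for (String path : search(indexDir, args[2], 10)) {
            System.out.println(path);
        }
    }
}
```

The same analyzer is used for indexing and for parsing the query, so that
query terms are tokenized the same way as the indexed text. Note that later
Lucene releases changed several of these constructors, so the sketch may need
adjusting for other versions.
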
>
> I'm sure there are a lot of Lucene tutorials around, for example:
> http://www.lucenetutorial.com/lucene-in-5-minutes.html. It covers pretty much
> what I've mentioned above.
>
> From there, you can expand to add search results highlighting (summaries /
> snippets) using e.g. PostingsHighlighter, faceted search using Lucene
> facets, spelling correction, and more.
>
> Also, are you aware of Solr, a search engine developed on top of
> Lucene? It takes care of all that for you, and has some pretty good
> tutorials and documentation.
> If you're not aiming to do something very challenging with these documents,
> I think Solr can help you set up search very quickly, without writing any
> code.
>
> Shai
>
>
> On Tue, Jul 9, 2013 at 2:44 AM, Vinh Dang <dqvin...@gmail.com> wrote:
>
> > Sorry for my typo,
> >
> > I meant Lucene 4.3.1.
> >
> > Thanks to Beale from the US for that :)
> >
> > ---
> > Best Regards
> > Vinh Dang
> > dqvin...@gmail.com
> >
> >
> >
> >
> > On Jul 8, 2013, at 9:46 PM, Vinh Dang <dqvin...@gmail.com> wrote:
> >
> > > Hi everyone,
> > >
> > > I am very new to Lucene, so please forgive me if my question is quite
> > > stupid.
> > >
> > > I spent a whole day googling how to get started with Lucene 4.6.1, but
> > > failed. I found some clear tutorials, but they were written for Lucene
> > > versions that are too old (mostly 2.x).
> > >
> > > My task is:
> > > I have a folder that contains multiple .DOC files with Unicode
> > > characters (Vietnamese characters, to be precise).
> > > I want to index this folder with Lucene (4.6.1 would be best, but other
> > > versions are OK).
> > >
> > > Could you give me a starting point?
> > >
> > > Thank you very much,
> > >
> > > ---
> > > Best Regards
> > > Vinh Dang
> > > dqvin...@gmail.com
> > >
> > >
> > >
> > >
> >
> >
>
