I did: http://www.lucenetutorial.com/lucene-in-5-minutes.html
Also take a look at our demo code:
http://lucene.apache.org/core/4_3_1/demo/src-html/org/apache/lucene/demo/IndexFiles.html

Shai


On Tue, Jul 9, 2013 at 5:57 AM, Vinh Đặng <dqvin...@gmail.com> wrote:

> Hi Shai,
>
> Thank you very much, I have succeeded in indexing and searching with Solr.
>
> But actually, I expected that I could import Lucene as a library (I am not
> a Java expert, I am more familiar with C/C++) and call some Lucene functions.
>
> Could you give me the URL of a Lucene 4 tutorial that is useful for a Java
> newbie?
>
>
> ----------------------------------
> Thanks and Best Regards
>
> Vinh Dang (Msc.)
> Project Manager
> FPT Software
> Mobile: +84 982 058 956
> Skype: dqvinh87
> Y!M: dqvinh87
> Email: dqvin...@gmail.com
> Websites: http://www.vinhdq.blogspot.com
>
>
> On Tue, Jul 9, 2013 at 7:26 AM, Shai Erera <ser...@gmail.com> wrote:
>
> > Well ... at a high level, this is what you should do:
> >
> >    1. Integrate with Apache Tika for parsing the .DOC files (and maybe
> >       other office files you have)
> >    2. Tika extracts the contents of the document, as well as some
> >       metadata
> >    3. Create a Lucene Document object to which you add Fields:
> >       1. TextField for e.g. the "content" field
> >       2. StringField for e.g. the path to the document on the file system
> >       3. NumericDocValuesField for e.g. the document's modification date
> >       4. Perhaps another StringField for the document's type (Word,
> >          PowerPoint)
> >    4. Index these documents with IndexWriter
> >    5. Search using IndexSearcher
> >
> > I'm sure there are a lot of Lucene tutorials around, for example:
> > http://www.lucenetutorial.com/lucene-in-5-minutes.html. It covers pretty
> > much what I've mentioned above.
> >
> > From there, you can expand to add search results highlighting (summaries /
> > snippets) using e.g. PostingsHighlighter, faceted search using Lucene
> > facets, spelling correction and more.
> >
> > Also, are you aware of Solr, which is a search engine developed on top of
> > Lucene? It takes care of all that for you, and has some pretty good
> > tutorials and documentation.
> > If you're not aiming to do something very challenging with these documents,
> > I think Solr can help you set up search very quickly, without writing any
> > code.
> >
> > Shai
> >
> >
> > On Tue, Jul 9, 2013 at 2:44 AM, Vinh Dang <dqvin...@gmail.com> wrote:
> >
> > > Sorry for my typo,
> > >
> > > I mean Lucene 4.3.1,
> > >
> > > Thank Beale from US for that :)
> > >
> > > ---
> > > Best Regards
> > > Vinh Dang
> > > dqvin...@gmail.com
> > >
> > >
> > > On Jul 8, 2013, at 9:46 PM, Vinh Dang <dqvin...@gmail.com> wrote:
> > >
> > > > Hi everyone,
> > > >
> > > > I am very new to Lucene, so please forgive me if my question is quite
> > > > stupid.
> > > >
> > > > I spent a whole day googling how to start with Lucene 4.6.1, but
> > > > failed. I found some clear tutorials, but they were written for too old
> > > > Lucene versions (almost 2).
> > > >
> > > > My tasks are:
> > > > I have a folder which contains multiple .DOC files, with Unicode
> > > > characters (actually, they are Vietnamese characters).
> > > > I want to index this folder with Lucene (4.6.1 is the best, but another
> > > > version is OK).
> > > >
> > > > Could you give me a point to start from?
> > > >
> > > > Thank you very much,
> > > >
> > > > ---
> > > > Best Regards
> > > > Vinh Dang
> > > > dqvin...@gmail.com
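
For anyone finding this thread later: steps 3 to 5 of the outline quoted above (build a Document from fields, index it with IndexWriter, search with IndexSearcher) might look roughly like the sketch below. It is not the official demo code. It assumes the Lucene 4.3.1 jars lucene-core, lucene-analyzers-common and lucene-queryparser are on the classpath, and that the text has already been extracted from the .DOC files (for example by Tika, as in steps 1 and 2). The class name LuceneSketch, the field names "content", "path", "type" and "modified", the sample paths and the in-memory RAMDirectory are all illustrative choices.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.NumericDocValuesField;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

public class LuceneSketch {
    // Assumption: Lucene 4.3.x; the Version constant must match the jars in use.
    static final Version V = Version.LUCENE_43;

    // Step 3: one Lucene Document per file, built from already-extracted text.
    static void addDoc(IndexWriter writer, String path, String type,
                       long modified, String content) throws IOException {
        Document doc = new Document();
        doc.add(new TextField("content", content, Field.Store.NO));   // tokenized, searchable
        doc.add(new StringField("path", path, Field.Store.YES));      // stored as a single token
        doc.add(new StringField("type", type, Field.Store.YES));
        doc.add(new NumericDocValuesField("modified", modified));     // per-document numeric value
        writer.addDocument(doc);
    }

    // Steps 4 and 5 on two made-up sample documents; returns the paths of the hits.
    static List<String> demo(String queryText) throws Exception {
        Analyzer analyzer = new StandardAnalyzer(V);
        Directory dir = new RAMDirectory(); // in-memory index, only for this sketch

        // Step 4: index with IndexWriter.
        IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig(V, analyzer));
        try {
            addDoc(writer, "docs/a.doc", "Word", 1373328000000L,
                   "Apache Lucene xin chào Việt Nam");
            addDoc(writer, "docs/b.ppt", "PowerPoint", 1373241600000L,
                   "Báo cáo tháng bảy");
        } finally {
            writer.close(); // commits the added documents
        }

        // Step 5: search with IndexSearcher, using the same analyzer for the query.
        List<String> paths = new ArrayList<String>();
        DirectoryReader reader = DirectoryReader.open(dir);
        try {
            IndexSearcher searcher = new IndexSearcher(reader);
            Query query = new QueryParser(V, "content", analyzer).parse(queryText);
            for (ScoreDoc hit : searcher.search(query, 10).scoreDocs) {
                paths.add(searcher.doc(hit.doc).get("path"));
            }
        } finally {
            reader.close();
        }
        return paths;
    }

    public static void main(String[] args) throws Exception {
        // Compile with -encoding UTF-8 so the Vietnamese literals survive.
        System.out.println(demo("Việt"));
    }
}
```

For a real folder of files you would probably replace RAMDirectory with FSDirectory.open(new File(indexPath)) and loop over the files, as the IndexFiles demo linked at the top of this message does. Java strings are Unicode, so Vietnamese text needs no special handling here beyond reading and compiling the sources with the right encoding.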