What do you think about adding your description of the update process to the wiki [1]? We can use this as the start of documenting the indexing functionality. You have written a nice description, and it would be good to have it in a place where others can see it and learn about our indexing process.
[1] https://cwiki.apache.org/confluence/display/VXQUERY/Lucene+Indexing+Project+2016

On Thu, Jun 23, 2016 at 3:39 PM, Menaka Madushanka <[email protected]> wrote:

> Hello,
>
> I modified the implementation to use only one argument for update index
> query.
>
> So the new query structure would be,
>
> *update-index(index_folder)*
>
> Collection information is stored when creating the index for the first time
> in build-index-on-collection query and stored as metadata.
>
> Thank you very much
> Menaka
>
> On 24 June 2016 at 03:42, Menaka Madushanka <[email protected]> wrote:
>
>> Hello Steven,
>>
>> Almost done. :-)
>>
>> On 24 June 2016 at 03:16, Steven Jacobs <[email protected]> wrote:
>>
>>> Auto-correct is always changing your name when I don't pay attention, I
>>> apologize Menaka.
>>> Steven
>>>
>>> On Thu, Jun 23, 2016 at 2:45 PM, Steven Jacobs <[email protected]> wrote:
>>>
>>>> Melaka- One high level comment. I think it will be better to have
>>>> update-index take a single argument as we discussed (just the index
>>>> folder). The collection location can be saved as part of the metadata
>>>> information in the collection folder.
>>>> Steven
>>>>
>>>> On Wed, Jun 22, 2016 at 2:04 PM, Menaka Madushanka <
>>>> [email protected]> wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> This is the summary of the implementation. (Included in Pull Request
>>>>> message as well)
>>>>>
>>>>> *Update Index Query*
>>>>> The update-index query takes two arguments, collection directory and
>>>>> index directory.
>>>>> It shares some of the functionalities from build-index-on-collection
>>>>> query so, some changes were done to the following classes in order to use
>>>>> them in updating index process and to maximize code reuse
>>>>>
>>>>> 1. IndexConstructorUtil.java : Created a new function to get an
>>>>>    instance of IndexDocumentBuilder which can be used in IndexUpdater.java
>>>>>    class.
>>>>> 2. IndexDocumentBuilder.java : Added a new string field containing
>>>>>    the corresponding file path which is needed to retrieve a document
>>>>>    related to an XML file.
>>>>>
>>>>> *Metadata handling*
>>>>> Here a POJO is created to properly manage the metadata for a file.
>>>>> (XmlMetadata.java)
>>>>> Currently it contains following fields.
>>>>>
>>>>> 1. File path
>>>>> 2. File Name (Not used)
>>>>> 3. Checksum String
>>>>>
>>>>> When storing metadata, a HashMap is created with file path as the key
>>>>> and XmlMetadata object. This map is then serialized and written to a file
>>>>> named metadata.file and stored in the same directory where the index is
>>>>> stored.
>>>>>
>>>>> *Update Index process*
>>>>>
>>>>> - If a file is detected as modified, the current index document
>>>>>   related to that file is deleted and newly created index document is
>>>>>   added.
>>>>> - If a new file is detected, a new index document will be created
>>>>>   and added to the existing index.
>>>>> - If the file is deleted, delete the index document related to that
>>>>>   file.
>>>>> - After every task, update the metadata object and after all
>>>>>   processes completed, write the new metadata map to the file.
>>>>>
>>>>> Please review the pull request and merge.
>>>>>
>>>>> https://github.com/apache/vxquery/pull/62
>>>>>
>>>>> Thank you
>>>>> Menaka
>>>>>
>>>>> --
>>>>> *Menaka Madushanka Jayawardena*
>>>>> Faculty of Engineering, <http://www.pdn.ac.lk/eng>
>>>>> University of Peradeniyaya.
>>>>> LinkedIn <http://lk.linkedin.com/in/menakajayawardena>
>>>>> TP:- 071 885 1183/ 071 350 5470
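
For the wiki page, the update steps in the quoted summary could be illustrated with something like the sketch below. It is only an illustration under assumptions, not the code from the pull request: the class name UpdatePlanSketch, the Action enum, and the plan method are all hypothetical, checksums are treated as plain strings, and no Lucene calls are made. It only shows how comparing the stored metadata map against the current state of the collection yields the add, replace, and delete decisions.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch (not VXQuery's IndexUpdater): decides what to do with
// each file by comparing stored checksums against freshly computed ones.
class UpdatePlanSketch {

    // REPLACE means "delete the old index document, then add a new one".
    enum Action { ADD, REPLACE, DELETE }

    /**
     * @param stored  file path -> checksum recorded at the last build or update
     * @param current file path -> checksum computed from the collection now
     * @return file path -> action needed to bring the index up to date;
     *         unchanged files are left out of the result
     */
    static Map<String, Action> plan(Map<String, String> stored,
                                    Map<String, String> current) {
        Map<String, Action> actions = new TreeMap<>();

        for (Map.Entry<String, String> entry : current.entrySet()) {
            String oldChecksum = stored.get(entry.getKey());
            if (oldChecksum == null) {
                // New file: no metadata entry exists yet.
                actions.put(entry.getKey(), Action.ADD);
            } else if (!oldChecksum.equals(entry.getValue())) {
                // Modified file: checksum differs from the stored one.
                actions.put(entry.getKey(), Action.REPLACE);
            }
        }

        for (String path : stored.keySet()) {
            if (!current.containsKey(path)) {
                // Deleted file: present in metadata but gone from the collection.
                actions.put(path, Action.DELETE);
            }
        }
        return actions;
    }
}
```

After applying the planned actions, the metadata map would be updated to match the current checksums and written back once, as the last step in the summary describes.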
