It might also be helpful to look at the AsterixDB external data and
indexing paper in CIKM'15 for inspiration...?
On Jun 5, 2016 11:11 AM, "Preston Carman" <[email protected]> wrote:

> As we consider creating a metadata file for each index, let's consider
> what other information could be stored with the index? What types of
> functionality do we need to have a complete indexing story?
> As I understand it, we support creating an index and searching using
> that index. Would we want to show the user a list of indexes? Menaka's
> e-mail suggests we need a way to update an index. What other
> queries/features should we support around indexes?
>
> Indexing Features
>  * Create index
>  * Search using index
>  * Update index???
>  * List indexes???
>  * Delete index???
>
> On Sat, Jun 4, 2016 at 10:18 PM, Menaka Madushanka
> <[email protected]> wrote:
> > Hi everyone,
> >
> > I came up with an implementation plan for the $subject. This will be able
> > to detect file content changes as well as deletions and additions.
> >
> > Methodology:
> > 1. Generate a checksum (MD5/SHA) for each file. These checksum values will
> > be written to a single properties file in the following format.
> >
> > path_to_the_file=checksum_string
> >
>
> Is there anything else that we will eventually want in a metadata file?
>
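
To make the metadata file discussion concrete, here is a minimal sketch of
step 1, assuming the plain path=checksum format quoted above. It is written
in Python purely for illustration. The function names, the choice of SHA-256
(the proposal only says MD5/SHA), and the chunk size are all assumptions,
not anything taken from the project.

```python
import hashlib
from pathlib import Path


def file_checksum(path, algorithm="sha256", chunk_size=65536):
    """Return the hex digest of one file, reading in chunks to bound memory use."""
    # Assumption: SHA-256; the proposal only says "MD5/SHA".
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_checksums(collection_dir):
    """Map every regular file under collection_dir to its checksum."""
    root = Path(collection_dir)
    return {
        str(path): file_checksum(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def write_checksums(checksums, metadata_path):
    """Write one path=checksum line per file (hypothetical metadata layout)."""
    # Keep the metadata file outside the collection directory, otherwise it
    # would be picked up and checksummed on the next run.
    with open(metadata_path, "w", encoding="utf-8") as out:
        for path, checksum in sorted(checksums.items()):
            out.write(f"{path}={checksum}\n")
```

If more per-index information is wanted later, a flat path=checksum layout
like this one would probably need a second section or a richer format.
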
> >
> > 2. On the first run, the checksums will be calculated and the properties
> > file will be created.
> >
>
> Sounds good.
>
> > 3. When running a query,
> >
> > The properties file will be read and loaded into memory.
> > The checksum values will be checked for each file.
> > If any modification is detected, the index will be updated and the new
> > checksum value will be stored.
> >
> > When checking the checksums, the path is taken from the file itself and
> > used to retrieve the stored checksum for that file from the properties
> > file. So any file insertion or deletion can be detected, because we
> > consider the actual files first.
> >
>
> When you say run a query, is this an UPDATE query or a SEARCH query? I
> think at this point we only want to cause the update action to happen
> for an UPDATE query. The overhead of updating the index before searching
> could be too much. Let's first get UPDATE working.
>
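
A rough sketch of the comparison in step 3, again in Python and only for
illustration. It assumes the stored checksums come from a path=checksum
file as proposed, and that the current checksums have already been computed
from the files on disk. The names load_checksums and diff_checksums are
hypothetical. Reporting deletions as stored paths that no longer appear on
disk is one possible reading of the proposal, not a confirmed design.

```python
def load_checksums(metadata_path):
    """Read path=checksum lines into a dict (hypothetical metadata layout)."""
    stored = {}
    with open(metadata_path, encoding="utf-8") as handle:
        for line in handle:
            line = line.rstrip("\n")
            if not line:
                continue
            # Split on the last "=" so paths containing "=" still parse;
            # hex digests never contain that character.
            path, separator, checksum = line.rpartition("=")
            if separator:
                stored[path] = checksum
    return stored


def diff_checksums(stored, current):
    """Return (added, modified, deleted) paths between two checksum maps."""
    added = sorted(p for p in current if p not in stored)
    modified = sorted(
        p for p in current if p in stored and stored[p] != current[p]
    )
    # Assumption: a stored path with no file on disk counts as a deletion.
    deleted = sorted(p for p in stored if p not in current)
    return added, modified, deleted
```

An UPDATE query could then re-index only the added and modified paths, drop
the deleted ones, and rewrite the metadata file.
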
> > To make the process clearer, I have attached the flow diagram herewith.
> >
>
> I do not see the diagram. Apache will only forward certain types of
> attachments. Can you post a link to your diagram?
>
> > I'd be very happy to have any feedback on this approach.
> >
> > Thank you very much
> > Menaka
> >
> > --
> > Menaka Madushanka Jayawardena
> > Faculty of Engineering,
> > University of Peradeniya.
> > LinkedIn
> > TP:- 071 885 1183/ 071 350 5470
>
