Thank you very much, Till and Michael. I'll take a look.

On 7 June 2016 at 23:58, Michael J. Carey <[email protected]> wrote:
> It might also be helpful to look at the AsterixDB external data and
> indexing paper in CIKM'15 for inspiration...?
>
> On Jun 5, 2016 11:11 AM, "Preston Carman" <[email protected]> wrote:
>
> > As we consider creating a metadata file for each index, let's consider
> > what other information could be stored with the index. What types of
> > functionality do we need to have a complete indexing story?
> > As I understand it, we support creating an index and searching using
> > that index. Would we want to show the user a list of indexes? Menaka's
> > e-mail suggests we need a way to update an index. What other
> > queries/features should we support around indexes?
> >
> > Indexing Features
> > * Create index
> > * Search using index
> > * Update index???
> > * List indexes???
> > * Delete index???
> >
> > On Sat, Jun 4, 2016 at 10:18 PM, Menaka Madushanka
> > <[email protected]> wrote:
> > > Hi everyone,
> > >
> > > I came up with an implementation plan for the $subject. This will be
> > > able to detect file content changes as well as deletions and additions.
> > >
> > > Methodology:
> > > 1. Generate a checksum (MD5/SHA) for each file. These checksum values
> > > will be written to a single properties file in the following format:
> > >
> > > path_to_the_file=checksum_string
> > >
> >
> > Is there anything else that we will eventually want in a metadata file?
> >
> > > 2. On the first run, the checksums will be calculated and the
> > > properties file will be created.
> > >
> >
> > Sounds good.
> >
> > > 3. When running a query,
> > >
> > > The properties file will be read and loaded into memory.
> > > The checksum values will be checked for each file.
> > > If any modification is detected, the index will be updated and the new
> > > checksum value will be stored.
> > >
> > > In the process of checking the checksum, the path will be taken from
> > > the file itself and used to retrieve the checksum for that file from
> > > the properties.
> > > So any file insertion or deletion can be detected, because we consider
> > > the actual file first.
> > >
> >
> > When you say run a query, is this an UPDATE query or a SEARCH query? I
> > think at this point we only want to cause the update action to happen
> > for an UPDATE query. The overhead of updating before searching
> > could be too much. Let's first get UPDATE working.
> >
> > > To make the process more clear, I have attached the flow diagram
> > > herewith.
> > >
> >
> > I do not see the diagram. Apache will only forward certain types of
> > attachments. Can you post a link to your diagram?
> >
> > > I'd be very happy to have any feedback on this approach.
> > >
> > > Thank you very much
> > > Menaka
> > >
> > > --
> > > Menaka Madushanka Jayawardena
> > > Faculty of Engineering,
> > > University of Peradeniyaya.
> > > LinkedIn
> > > TP:- 071 885 1183/ 071 350 5470
> >

--
*Menaka Madushanka Jayawardena*
Faculty of Engineering, <http://www.pdn.ac.lk/eng>
University of Peradeniyaya.
LinkedIn <http://lk.linkedin.com/in/menakajayawardena>
TP:- 071 885 1183/ 071 350 5470
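
For anyone following the thread, the checksum scheme quoted above (one path_to_the_file=checksum_string line per file, with the actual files scanned first and then compared against the stored values) could be sketched roughly as below. This is only an illustration in Python, not the project's code. All function names are hypothetical, SHA-256 is an arbitrary pick from the "MD5/SHA" options mentioned, and the file is written as plain path=checksum lines without the escaping rules of a real Java properties file.

```python
import hashlib
import os


def file_checksum(path, algorithm="sha256", chunk_size=65536):
    """Return the hex digest of a file, reading it in chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def scan_collection(root):
    """Walk the actual files first and map each path to its current checksum."""
    current = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            current[path] = file_checksum(path)
    return current


def load_manifest(manifest_path):
    """Read path=checksum lines; a missing file means this is the first run."""
    stored = {}
    if not os.path.exists(manifest_path):
        return stored
    with open(manifest_path, "r", encoding="utf-8") as handle:
        for line in handle:
            line = line.rstrip("\n")
            if not line:
                continue
            # Split on the last '=' so a path containing '=' still parses
            # (a hex digest never contains '=').
            path, _, checksum = line.rpartition("=")
            stored[path] = checksum
    return stored


def save_manifest(manifest_path, checksums):
    """Write one path=checksum line per file, sorted for stable output."""
    with open(manifest_path, "w", encoding="utf-8") as handle:
        for path in sorted(checksums):
            handle.write(f"{path}={checksums[path]}\n")


def detect_changes(stored, current):
    """Compare stored and current checksums; return (added, modified, deleted)."""
    added = sorted(p for p in current if p not in stored)
    deleted = sorted(p for p in stored if p not in current)
    modified = sorted(
        p for p in current if p in stored and stored[p] != current[p]
    )
    return added, modified, deleted
```

An UPDATE would then call scan_collection, compare the result with load_manifest via detect_changes, re-index only the reported paths, and finish with save_manifest. The sketch assumes the manifest is kept outside the scanned directory; otherwise it would show up as a changed file on every run.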
