Murtadha Hubail has posted comments on this change. Change subject: Add support for persistent local resources to IndexLifecycleManager ......................................................................
Patch Set 2: To answer your first point, here is some background behind the reason of this change. a) A resource location (Name) is passed to IndexDataFlowHelper as an absolute path from a FileSplit. b) The current design of PersistentLocalResourceRepository assumes that the mapping information between (Resource Name -> LocalResource) is already in memory, and all resources info is loaded in memory during startup. c) We also need to keep a different mapping information between (ID -> LocalResource) since the API of IIndexLifecycleManager (DatasetLifecycleManager in Asterix) is only using the resource id, so if it needs to access any other info regarding this resource (e.g. Dataset ID), it needs to use the (ID -> LocalResource) map. If we have a huge number of resources, all of that info will be kept in memory all the time until a resource is destroyed. I fixed this on the Asterix side of this change and made PersistentLocalResourceRepository load resources on demand and there is a cache between (Resource Name -> LocalResource) with a certain max size and got rid of the two mappings. But now DatasetLifecycleManager won’t be able to access the resource using the ID only. That’s why I introduced this change here to pass the name instead of the id. Using the Name, it will ask PersistentLocalResourceRepository for the resource using its Name and it will be fetched from the cache or disk. Another option I could’ve done is to pass the whole LocalResource instead of the name. This way DatasetLifecycleManager wouldn’t need to fetch it again at all. But then I found that we have a logic that is based on checking whether that resource actually exists or not. Coming back to the rest of your points: 2) I could change the interface to use Object, but it will again break the existing code of other clients. 3) If we all agree that String is the better identifier, I’m going to add deprecated to the resource ID methods. 4) Clients do not need to make this check since they should know what type of ILocalResourceRepository they are using. I added this check here because IndexDataflowHelper is a shared class between our clients. New clients could create a new class which implements IIndexDataflowHelper that doesn’t need to make this check. Hopefully we could implement point 3 and git rid of this check as well. I also wanted to make this check based on instanceof instead of a boolean but couldn’t due to projects dependancies. Hope this answers your points. -- To view, visit https://asterix-gerrit.ics.uci.edu/343 To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings Gerrit-MessageType: comment Gerrit-Change-Id: I2e8e974fc2f746959639ce94351f8e419a7f9093 Gerrit-PatchSet: 2 Gerrit-Project: hyracks Gerrit-Branch: master Gerrit-Owner: Murtadha Hubail <[email protected]> Gerrit-Reviewer: Ian Maxon <[email protected]> Gerrit-Reviewer: Jenkins <[email protected]> Gerrit-Reviewer: Murtadha Hubail <[email protected]> Gerrit-Reviewer: Till Westmann <[email protected]> Gerrit-Reviewer: Yingyi Bu <[email protected]> Gerrit-Reviewer: Young-Seok Kim <[email protected]> Gerrit-Reviewer: abdullah alamoudi <[email protected]> Gerrit-HasComments: No
