Thanks Matt!! I will take a look tomorrow.

Regards
Amit

On Thu, Sep 8, 2016 at 8:53 PM, Matt Ryan <o...@mvryan.org> wrote:

> New patch attached, addressing the feedback in OAK-4712 on the last patch.
>
> On Thu, Sep 1, 2016 at 11:00 AM, Matt Ryan <o...@mvryan.org> wrote:
>
>> I believe this patch addresses the issues identified with the previous
>> patch.  I've also uploaded it to OAK-4712.
>>
>> Looking for review and feedback.
>>
>> This patch introduces a new interface I've named NodeIdMapper.java, and
>> I'm not sure where it should live.  I've added it in order to be able to
>> obtain the blob ID for a named node within the context of a
>> NodeStoreService class, using an anonymous class.  However, most of the
>> functionality in each case (DocumentNodeStoreService and
>> SegmentNodeStoreService) is similar.  This could probably be addressed
>> with a lot less code duplication by using an abstract base class, but I'm
>> not sure where such a class would best be placed.  Ideas?
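>>
>> Roughly, the interface is just a single lookup method -- something like
>> the sketch below (the exact method name and signature in the patch may
>> differ):
>>
>>     public interface NodeIdMapper {
>>         /**
>>          * Return the blob ID backing the named node, or null if the node
>>          * has no binary or cannot be resolved.
>>          */
>>         String getBlobId(String nodeName);
>>     }
>>
>> Each *NodeStoreService then supplies an anonymous implementation that
>> resolves the blob ID from its own node store.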
>>
>> I also want to call attention to how the blob ID is being used in these
>> anonymous classes in the *NodeStoreService classes.  The IDs I was getting
>> back had a trailing '#' followed by some number of digits.  I just split
>> the string and discarded the last part.  Is there a better way to do this?
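>>
>> For reference, the handling is essentially this (treating everything after
>> the '#' as a suffix to discard -- my assumption is that it's a length
>> suffix added by the blob store):
>>
>>     // Strip the trailing "#<digits>" portion, if present, to get the raw
>>     // blob/content ID.
>>     String id = blobId;
>>     int hash = id.indexOf('#');
>>     if (hash != -1) {
>>         id = id.substring(0, hash);
>>     }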
>>
>>
>> On Sun, Aug 28, 2016 at 10:04 PM, Amit Jain <am...@ieee.org> wrote:
>>
>>> Hi Matt,
>>>
>>> I have replied directly to your comments on the JIRA issue.
>>>
>>> Thanks
>>> Amit
>>>
>>> On Sat, Aug 27, 2016 at 4:22 AM, Matt Ryan <o...@mvryan.org> wrote:
>>>
>>> > Use this patch instead; it's updated against the latest in trunk.
>>> >
>>> > On Fri, Aug 26, 2016 at 4:41 PM, Matt Ryan <o...@mvryan.org> wrote:
>>> >
>>> >> Hi Oak Devs,
>>> >>
>>> >> I've created OAK-4712 and submitted a patch for it.  I've also attached
>>> >> the same patch to this email.
>>> >>
>>> >> The submission adds a new MBean, S3DataStoreStats, which will allow
>>> >> reporting via JMX on the state of the S3DataStore.  Two metrics are
>>> >> intended.  The first reports the number of files that are in the sync
>>> >> state, meaning they have been added to the S3DataStore but are not yet
>>> >> completely copied into S3.  The second allows querying the sync state
>>> >> of a single file - a return value of true means the file provided is
>>> >> fully synchronized.  This uses an external file name, not the internal
>>> >> ID.
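>>> >>
>>> >> In other words, the MBean boils down to two operations, roughly like
>>> >> the sketch below (getActiveSyncs is an illustrative name; isFileSynced
>>> >> is the method referred to further down):
>>> >>
>>> >>     public interface S3DataStoreStats {
>>> >>         // Number of files added to the S3DataStore that are not yet
>>> >>         // completely copied into S3.
>>> >>         long getActiveSyncs();
>>> >>
>>> >>         // True if the file identified by the given (external) name
>>> >>         // has been fully copied into S3.
>>> >>         boolean isFileSynced(String fileName);
>>> >>     }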
>>> >>
>>> >> I have some questions about the implementation first:
>>> >> 1. For the request to query the sync state of a file, should the file
>>> >> name provided be a full path to a local file or a path to a file
>>> >> within OAK (e.g. /content/dam/myimage.jpg)?  The current implementation
>>> >> uses a local file path, but I have been wondering if it should be an
>>> >> OAK path.
>>> >> 2. For the request to query the sync state of a file, when converting
>>> >> from the externally-supplied file name to an internal DataIdentifier,
>>> >> this implementation performs the same calculation to determine the
>>> >> internal ID name as is done when a file is stored.  I have a number of
>>> >> concerns with this:
>>> >>    - It is inefficient - the entire file has to be read and digested
>>> >> in order to compute the internal ID.  This takes a long time for large
>>> >> assets.
>>> >>    - I've essentially duplicated the logic from CachingDataStore into
>>> >> S3DataStore to compute the internal ID.  I hate duplicating the code,
>>> >> but I am trying to avoid exposing internal IDs in the API, and I don't
>>> >> see a good way in the current implementation to avoid that without
>>> >> either modifying the public API of CachingDataStore, or exposing the
>>> >> internal ID via the API, or both (the duplicated digest logic is
>>> >> sketched below).
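>>> >>
>>> >> For clarity, the duplicated logic amounts to digesting the whole file
>>> >> and hex-encoding the result, roughly as below.  I'm using SHA-1 here
>>> >> because that appears to be what the data store uses to derive its
>>> >> identifiers; the class and method names are just for illustration.
>>> >>
>>> >>     import java.io.FileInputStream;
>>> >>     import java.io.IOException;
>>> >>     import java.io.InputStream;
>>> >>     import java.security.DigestInputStream;
>>> >>     import java.security.MessageDigest;
>>> >>     import java.security.NoSuchAlgorithmException;
>>> >>
>>> >>     final class InternalIds {
>>> >>         // Read the entire file through a DigestInputStream so the
>>> >>         // digest is updated as a side effect, then hex-encode it.
>>> >>         // This full read is the inefficiency mentioned above.
>>> >>         static String internalIdFor(String filePath)
>>> >>                 throws IOException, NoSuchAlgorithmException {
>>> >>             MessageDigest md = MessageDigest.getInstance("SHA-1");
>>> >>             try (InputStream in = new DigestInputStream(
>>> >>                     new FileInputStream(filePath), md)) {
>>> >>                 byte[] buf = new byte[16 * 1024];
>>> >>                 while (in.read(buf) != -1) {
>>> >>                     // nothing to do; reading updates the digest
>>> >>                 }
>>> >>             }
>>> >>             StringBuilder hex = new StringBuilder();
>>> >>             for (byte b : md.digest()) {
>>> >>                 hex.append(String.format("%02x", b & 0xff));
>>> >>             }
>>> >>             return hex.toString();
>>> >>         }
>>> >>     }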
>>> >>
>>> >> Any suggestions on these two issues?
>>> >>
>>> >>
>>> >> I'm also experiencing a problem with this patch.  In my testing it
>>> >> appears to work fine, until I delete a file.  For example, if I delete
>>> >> an asset via the REST API, I will see the asset deleted in CRXDE.
>>> >> However, the file still remains in S3.  This MBean as implemented only
>>> >> knows how to check with S3DataStore and the corresponding backend, and
>>> >> these all appear to believe the file still exists.  So the MBean
>>> >> continues to report that the file's sync state is synchronized (i.e.
>>> >> isFileSynced() returns true) even though the file has been removed
>>> >> from the JCR.  Any suggestions on how to resolve this?
>>> >>
>>> >>
>>> >> Finally, any feedback on the patch is welcome.  And if I did the
>>> >> process wrong, please correct me (gently) - this is my first
>>> >> submission here.  Thanks
>>> >>
>>> >>
>>> >
>>>
>>
>>
>
