[ 
https://issues.apache.org/jira/browse/OAK-3806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15068246#comment-15068246
 ] 

Thomas Mueller commented on OAK-3806:
-------------------------------------

> //TODO FIXME Need to determine the blob length from id.

Actually, you already know the length from the CountingInputStream 
(cin.getCount()), so the comment seems unnecessary.

>  @param size size of binary content being written
> void uploaded(long timeTaken, TimeUnit unit, long size);

I assume the size is in bytes, but it would be good to document that.

> long getUploadSize();
> int getUploadCount();
> long getUploadTimeInSecs();
> long getDownloadSize();
> int getDownloadCount();
> CompositeData getUploadHistory();
> CompositeData getDownloadHistory();

getUploadTimeInSecs is there, but getDownloadTimeInSecs is missing. For "size" 
and "time", it's not clear whether the value is an average or a total. For the 
composite data, it's not clear what is measured (count, size, time?). What about:

{noformat}
long getUploadTotalSize();
long getDownloadTotalSize();
long getUploadTotalSeconds();
long getDownloadTotalSeconds();
CompositeData getUploadSizeHistory();
CompositeData getDownloadSizeHistory();
{noformat}
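
To illustrate the "total" semantics, here is a rough sketch of a collector behind such getters. Only the method names come from the discussion above; the class name BlobStatsSketch, the fields, and the choice to accumulate nanoseconds internally are assumptions made for illustration, not the code in the patch.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical collector; not the implementation from the patch. */
class BlobStatsSketch {
    private final AtomicLong uploadTotalBytes = new AtomicLong();
    private final AtomicLong uploadTotalNanos = new AtomicLong();
    private final AtomicInteger uploadCount = new AtomicInteger();

    /**
     * @param timeTaken time spent on the upload
     * @param unit      unit of timeTaken
     * @param size      size of the binary content written, in bytes
     */
    void uploaded(long timeTaken, TimeUnit unit, long size) {
        // Keep nanoseconds internally so short uploads are not rounded to 0.
        uploadTotalNanos.addAndGet(unit.toNanos(timeTaken));
        uploadTotalBytes.addAndGet(size);
        uploadCount.incrementAndGet();
    }

    /** Total number of bytes uploaded so far (a sum, not an average). */
    long getUploadTotalSize() {
        return uploadTotalBytes.get();
    }

    /** Total upload time so far, in whole seconds (a sum, not an average). */
    long getUploadTotalSeconds() {
        return TimeUnit.NANOSECONDS.toSeconds(uploadTotalNanos.get());
    }

    int getUploadCount() {
        return uploadCount.get();
    }
}
```

With totals and a count exposed, a client can still derive averages itself, so the MBean does not need separate "average" getters.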

> //Download time might not be accurate as reading code might
> //be processing also as it moved further in stream. 

This could be solved by measuring each read operation separately. That would 
add some overhead, but if the stream is wrapped in a BufferedInputStream, it 
wouldn't be all that bad.
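
A minimal sketch of that idea, assuming a hypothetical TimedInputStream wrapper (the class and its getters are not part of the patch). It times only the calls into the underlying stream, so whatever the consumer does between reads is not counted. With a BufferedInputStream on the outside, the timer runs once per buffer fill rather than once per byte.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical wrapper: accumulates time spent inside read calls only. */
class TimedInputStream extends FilterInputStream {
    private long totalNanos; // time spent in the underlying stream's reads
    private long readCalls;  // number of timed read operations

    TimedInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        long start = System.nanoTime();
        try {
            return in.read();
        } finally {
            record(start);
        }
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        long start = System.nanoTime();
        try {
            return in.read(b, off, len);
        } finally {
            record(start);
        }
    }

    private void record(long start) {
        totalNanos += System.nanoTime() - start;
        readCalls++;
    }

    long getTotalNanos() { return totalNanos; }

    long getReadCalls() { return readCalls; }
}

class TimedReadDemo {
    public static void main(String[] args) throws IOException {
        // In-memory data stands in for a blob stream here.
        TimedInputStream timed =
                new TimedInputStream(new ByteArrayInputStream(new byte[100_000]));
        try (InputStream in = new BufferedInputStream(timed)) {
            while (in.read() != -1) {
                // Consumer-side processing here is not included in the timing.
            }
        }
        System.out.println(timed.getReadCalls() + " timed reads, "
                + timed.getTotalNanos() + " ns in total");
    }
}
```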

> -        assertTrue(ds.getInputStream(dr.getIdentifier().toString()) 
> instanceof BufferedInputStream);
> +//        assertTrue(ds.getInputStream(dr.getIdentifier().toString()) 
> instanceof BufferedInputStream);

This could be avoided if the "wrapping order" is switched: wrap the 
CountingInputStream within a BufferedInputStream. I see this could mean double 
buffering: BufferedInputStream(CountingInputStream(BufferedInputStream(in))). 
Maybe the double buffering can be avoided as well.
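
A small sketch of the switched order. SimpleCountingInputStream is a stand-in written for this example so that it is self-contained; it is not the CountingInputStream used in the patch. One side effect to be aware of: with the buffer on the outside, the count reflects bytes fetched from the underlying stream, which can run ahead of what the caller has consumed until the stream is fully read.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Stand-in counting stream for illustration only. */
class SimpleCountingInputStream extends FilterInputStream {
    private long count;

    SimpleCountingInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int b = in.read();
        if (b != -1) {
            count++;
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = in.read(buf, off, len);
        if (n > 0) {
            count += n;
        }
        return n;
    }

    long getCount() { return count; }
}

class WrappingOrderDemo {
    public static void main(String[] args) throws IOException {
        InputStream raw = new ByteArrayInputStream(new byte[20_000]);
        SimpleCountingInputStream counting = new SimpleCountingInputStream(raw);
        // Buffer on the outside: callers still get a BufferedInputStream.
        InputStream result = new BufferedInputStream(counting);

        System.out.println("buffered: " + (result instanceof BufferedInputStream));

        result.read(); // the caller consumes a single byte
        // The buffer has read ahead, so the count may exceed 1 at this point.
        System.out.println("fetched after one read: " + counting.getCount());

        while (result.read() != -1) {
            // drain the rest
        }
        System.out.println("fetched after draining: " + counting.getCount());
    }
}
```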


> Collect and expose statistics related to BlobStore operations
> -------------------------------------------------------------
>
>                 Key: OAK-3806
>                 URL: https://issues.apache.org/jira/browse/OAK-3806
>             Project: Jackrabbit Oak
>          Issue Type: New Feature
>          Components: blob
>            Reporter: Chetan Mehrotra
>            Assignee: Chetan Mehrotra
>             Fix For: 1.3.13
>
>         Attachments: OAK-3806-v1.patch, OAK-3806-v2.patch, 
> blob-store-stats.png, blob-upload.png
>
>
> It would be useful to collect some statistics around BlobStore operations 
> like upload size, download size, how frequent uploads are done etc
> It should support following features
> * Collection across various implementation - For most cases just collecting 
> stats in {{DataStoreBlobStore}} and {{AbstractBlobStore}} should be sufficient
> * Collected stats should be exposed over JMX
> *Goals*
> # What are the number/size of downloads and uploads over period of time - The 
> time series data would help us understand any hot usage time
> # Are there too many repeated download for few blobIds - Later we can use 
> this information to cache such binary content locally and avoid hitting 
> remote stores (specially useful for RDB/Mongo-BlobStore)
> # What is the typical upload and download rate provided by the BlobStore - 
> Using this we can see if it varies, if its too low for Oak operational needs 
> etc



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
