[ 
https://issues.apache.org/jira/browse/HUDI-234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16921122#comment-16921122
 ] 

Vinoth Chandar commented on HUDI-234:
-------------------------------------

+1 to [~DavisBroda]'s suggestion. The first step could be just making it work. 

As for specifically using the Instrumentation framework, it seems like you 
just launch the JVM with the agent once and, from then on, you can fetch the 
size estimates? [~xleesf] that seems OK to me, unless this also "profiles" the 
app per se, which could make it really slow. 

Moreover, what concerns me more from that stackoverflow thread are things like: 
"I tried this and got strange and unhelpful results. Strings were always 32, 
regardless of size." If it cannot estimate the size of complex object graphs, 
then I am not sure how useful it is. 
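
To make the agent approach concrete, here is a minimal sketch of how the 
Instrumentation handle is usually captured once at JVM startup and queried 
afterwards. The class name SizeAgent and its helper methods are hypothetical 
and not part of Hudi. Note that Instrumentation.getObjectSize reports only the 
shallow size of the object passed in, which likely explains why Strings of any 
length came back with the same number in that thread.

```java
import java.lang.instrument.Instrumentation;
import java.util.OptionalLong;

// Hypothetical agent class (illustrative name, not Hudi code). In a real agent
// jar it would typically be declared public in its own file and named in the
// manifest's Premain-Class attribute.
final class SizeAgent {
    // Set once by the JVM; stays null if no -javaagent was supplied.
    private static volatile Instrumentation instrumentation;

    // Invoked by the JVM before main() when started with -javaagent:<jar>.
    public static void premain(String agentArgs, Instrumentation inst) {
        instrumentation = inst;
    }

    static boolean isAvailable() {
        return instrumentation != null;
    }

    // Shallow size only: the object's own header and fields, not the objects
    // it references (so a String's backing array is not counted).
    static OptionalLong shallowSizeOf(Object obj) {
        Instrumentation inst = instrumentation;
        return inst == null
                ? OptionalLong.empty()
                : OptionalLong.of(inst.getObjectSize(obj));
    }
}
```

A caller would check SizeAgent.isAvailable() (or the empty OptionalLong) and 
switch to another estimator when the JVM was started without the agent. Sizing 
a full object graph would still require walking references on top of this.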

 

I think we can implement a graceful fallback which uses some approximation or 
always assumes a certain fixed, configurable size, i.e. a FakeEstimator, which 
may cause additional spilling, but at least works. For example, if you said all 
objects are 1MB and they end up being 1KB, you just spill a lot, but things 
still work. Does this approach make sense? 
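
A rough sketch of what such a fallback could look like, assuming a small 
estimator interface. All names here (SizeEstimator, FixedSizeEstimator, 
FallbackSizeEstimator) are hypothetical and only illustrate the idea; they are 
not existing Hudi classes.

```java
// Hypothetical estimator abstraction (illustrative, not Hudi's API).
interface SizeEstimator {
    long estimate(Object obj);
}

// The "fake" estimator: every object is assumed to have the same configured size.
final class FixedSizeEstimator implements SizeEstimator {
    private final long fixedBytes;

    FixedSizeEstimator(long fixedBytes) {
        if (fixedBytes <= 0) {
            throw new IllegalArgumentException("fixedBytes must be positive");
        }
        this.fixedBytes = fixedBytes;
    }

    @Override
    public long estimate(Object obj) {
        return fixedBytes;
    }
}

// Tries the accurate estimator first; after its first failure, permanently
// switches to the fallback so the write path keeps working.
final class FallbackSizeEstimator implements SizeEstimator {
    private final SizeEstimator primary;
    private final SizeEstimator fallback;
    private volatile boolean primaryBroken;

    FallbackSizeEstimator(SizeEstimator primary, SizeEstimator fallback) {
        this.primary = primary;
        this.fallback = fallback;
    }

    @Override
    public long estimate(Object obj) {
        if (!primaryBroken) {
            try {
                return primary.estimate(obj);
            } catch (RuntimeException | LinkageError e) {
                // e.g. the primary estimator does not support this JVM
                primaryBroken = true;
            }
        }
        return fallback.estimate(obj);
    }
}
```

With a fixed size of 1MB, a memory budget fills up after far fewer entries 
than it would with accurate 1KB estimates, so more entries spill to disk, but 
nothing fails outright.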

Phase 2 after this could be finding a very performant and accurate object 
size estimator that works across JVMs (if such a thing exists :) ) 

 

> Graceful degradation of ObjectSizeCalculator for non hotspot jvms
> -----------------------------------------------------------------
>
>                 Key: HUDI-234
>                 URL: https://issues.apache.org/jira/browse/HUDI-234
>             Project: Apache Hudi (incubating)
>          Issue Type: Bug
>          Components: Write Client
>    Affects Versions: 0.5.0
>            Reporter: Vinoth Chandar
>            Priority: Major
>
> https://github.com/apache/incubator-hudi/issues/860 bug report 



--
This message was sent by Atlassian Jira
(v8.3.2#803003)
