[
https://issues.apache.org/jira/browse/HUDI-234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16921122#comment-16921122
]
Vinoth Chandar commented on HUDI-234:
-------------------------------------
+1 to [~DavisBroda]'s suggestion. First step could be just making it work.
As for specifically using the Instrumentation framework, it seems like you
just launch the JVM with the agent once and from then on you can fetch the
size estimates? [~xleesf] that seems ok to me, unless this also "profiles" the
app per se, which could make it really slow.
Moreover, what concerns me more from that Stack Overflow thread are things
like: "I tried this and got strange and unhelpful results. Strings were always
32, regardless of size." If it cannot estimate the size of complex object
graphs, then I am not sure how useful it is.
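
For reference, a minimal sketch of the agent approach discussed above. The
class name SizeAgent and its helper methods are hypothetical, not taken from
the Hudi codebase; it assumes the class is packaged in a jar whose manifest
sets Premain-Class and that the JVM is started with -javaagent pointing at
that jar. Instrumentation.getObjectSize reports only the shallow size of the
object passed in, not the objects it references, which is a likely explanation
for the constant String size quoted above.

```java
import java.lang.instrument.Instrumentation;

// Hypothetical agent class, for illustration only.
final class SizeAgent {
    // Set once by the JVM if the agent was attached at startup.
    private static volatile Instrumentation instrumentation;

    // Invoked by the JVM before main() when started with -javaagent.
    public static void premain(String agentArgs, Instrumentation inst) {
        instrumentation = inst;
    }

    // True only if the JVM was launched with this agent.
    static boolean isAvailable() {
        return instrumentation != null;
    }

    // Shallow size: covers the object's own header and fields, but not
    // anything reachable through its references (e.g. a String's backing array).
    static long shallowSizeOf(Object obj) {
        Instrumentation inst = instrumentation;
        if (inst == null) {
            throw new IllegalStateException("JVM was not started with -javaagent");
        }
        return inst.getObjectSize(obj);
    }
}
```

Getting the size of a whole object graph on top of this would still require
walking the references and summing the shallow sizes.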
I think we can implement a graceful fallback which uses some approximation or
always assumes a certain fixed, configurable size, i.e. a FakeEstimator, which
may cause additional spilling, but at least works. For example, if you said all
objects are 1MB and they end up being 1KB, you just spill a lot, but things
still work. Does this approach make sense?
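
A rough sketch of what such a fallback could look like. All names here
(SizeEstimator, FixedSizeEstimator, FallbackSizeEstimator) are hypothetical
and are not the actual Hudi interfaces; the assumption is that the precise
estimator fails with a runtime exception or linkage error on JVMs it does not
support.

```java
// Hypothetical interface, for illustration only.
interface SizeEstimator<T> {
    long sizeEstimate(T obj);
}

// The "fake" estimator: every object is assumed to have the same configured size.
final class FixedSizeEstimator<T> implements SizeEstimator<T> {
    private final long fixedSizeBytes;

    FixedSizeEstimator(long fixedSizeBytes) {
        if (fixedSizeBytes <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        this.fixedSizeBytes = fixedSizeBytes;
    }

    @Override
    public long sizeEstimate(T obj) {
        return fixedSizeBytes;
    }
}

// Uses the precise estimator until it fails once, then sticks to the fallback.
final class FallbackSizeEstimator<T> implements SizeEstimator<T> {
    private final SizeEstimator<T> primary;
    private final SizeEstimator<T> fallback;
    private volatile boolean primaryBroken = false;

    FallbackSizeEstimator(SizeEstimator<T> primary, SizeEstimator<T> fallback) {
        this.primary = primary;
        this.fallback = fallback;
    }

    @Override
    public long sizeEstimate(T obj) {
        if (!primaryBroken) {
            try {
                return primary.sizeEstimate(obj);
            } catch (RuntimeException | LinkageError e) {
                // Assumed failure mode on an unsupported JVM; stop retrying.
                primaryBroken = true;
            }
        }
        return fallback.sizeEstimate(obj);
    }
}
```

With the fixed size set to 1MB and real records around 1KB, a memory-bounded
map using this estimator would believe it is full roughly a thousand times too
early and spill accordingly, which is slow but still correct.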
Phase 2 after this could be finding a very performant and accurate object
size estimator that works across JVMs (if such a thing exists :) )
> Graceful degradation of ObjectSizeCalculator for non hotspot jvms
> -----------------------------------------------------------------
>
> Key: HUDI-234
> URL: https://issues.apache.org/jira/browse/HUDI-234
> Project: Apache Hudi (incubating)
> Issue Type: Bug
> Components: Write Client
> Affects Versions: 0.5.0
> Reporter: Vinoth Chandar
> Priority: Major
>
> https://github.com/apache/incubator-hudi/issues/860 bug report