>
> Also, I just realized here
> (http://www.erlang.org/doc/apps/erts/erl_ext_dist.html), cite:
> ===============
> A float is stored in string format. the format used in sprintf to
> format the float is "%.20e" (there are more bytes allocated than
> necessary)
> ===============
> So, every float requires 33 bytes of disk space. Not so efficient.
Reading the spec, I realized that passing {minor_version, 1} in the
term_to_binary options makes floats 9 bytes long instead of 33.
The change is just a search/replace in a few files.
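
For a single float, the difference can be checked directly. This is a minimal sketch, not code from CouchDB: it assumes a working escript install, the value 3.14 is arbitrary, and minor_version 0 is passed explicitly because newer Erlang/OTP releases no longer default to the string encoding.

```erlang
#!/usr/bin/env escript
%% Sketch: compare the external term format size of one float
%% under the two float encodings. 3.14 is an arbitrary sample value.
main(_) ->
    F = 3.14,
    %% minor_version 0: float stored as a 31-byte "%.20e" string (FLOAT_EXT)
    Old = term_to_binary(F, [{minor_version, 0}]),
    %% minor_version 1: float stored as an 8-byte IEEE double (NEW_FLOAT_EXT)
    New = term_to_binary(F, [{minor_version, 1}]),
    io:format("minor_version 0: ~p bytes~n", [byte_size(Old)]),
    io:format("minor_version 1: ~p bytes~n", [byte_size(New)]),
    %% Both encodings must decode back to the same float.
    F = binary_to_term(Old),
    F = binary_to_term(New),
    ok.
```

Both sizes include the leading version byte that term_to_binary puts in front of every top-level term, so a standalone float comes out at 33 and 10 bytes. Inside a larger term a float costs one byte less in each encoding (tag plus payload).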
I tested my data blob with this map function:
fun({Doc}) ->
    Emit(<<"raw">>, size(term_to_binary(Doc))),
    Emit(<<"raw_1">>, size(term_to_binary(Doc, [{minor_version, 1}])))
end.
The result was a ~10% decrease in space usage (~600MB instead of ~700MB).
Do I need to file a bug in Jira for it?
--
----------------
Best regards
Alexey Loshkarev
mailto:[email protected]