hello all,
DataFrame internally uses a different encoding for values than what the
user sees. i assume the same is true for Dataset?

if so, does this mean that a function like Dataset.map needs to convert
all the values twice (once to user format and then back to internal
format)? or is it perhaps possible to write scala functions that operate on
internal formats and avoid this?

i am excited to see lambdas back in full force in DataFrame/Dataset world.
the few functions on DataFrame that already accepted lambdas are not very
user friendly. but i am worried that what i want (blackbox/general scala
functions/lambdas, so i am not restricted to a few etl-like operators) is at
odds with the design of Dataset/DataFrame.

best, koert
