It is not good practice to store large binary blobs directly in a Hive table. Just store a reference (the HDFS path) in the table and keep the binary data itself as files on HDFS.
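A rough sketch of what that can look like (the table name, columns and path layout below are made up for illustration): only lightweight metadata plus the HDFS path goes into the Hive table, while the image bytes stay as plain files on HDFS.

  import org.apache.spark.sql.SparkSession

  val spark = SparkSession.builder()
    .appName("image-metadata-example")
    .enableHiveSupport()
    .getOrCreate()
  import spark.implicits._

  // The image bytes live as ordinary files on HDFS; only small columns
  // plus the path reference go into the Hive table.
  val metadata = Seq(
    (1L, "cat", "hdfs:///data/images/img_0001.jpg"),
    (2L, "dog", "hdfs:///data/images/img_0002.jpg")
  ).toDF("id", "label", "image_path")

  // Queries over this table never have to scan megabytes of binary content.
  metadata.write.mode("overwrite").saveAsTable("images_meta")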
> On 09.01.2022 at 15:34, weoccc <weo...@gmail.com> wrote:
>
> Hi,
>
> I want to store binary data (such as images) in a hive table, but the binary
> data column might be much larger than the other columns per row. I'm worried
> about query performance. One way I can think of is to separate the binary
> data from the other columns by creating 2 hive tables, running 2 separate
> spark queries and joining them later.
>
> Later, I found that parquet supports splitting columns into different files,
> as shown here:
> https://parquet.apache.org/documentation/latest/
>
> I'm wondering if spark sql already supports that? If so, how do I use it?
>
> Weide
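For the jobs that actually need the bytes, they can be loaded from HDFS separately and joined back against the reference table at that point, which is essentially your "join them later" idea without ever putting the blobs into the table. A sketch, continuing the one above and assuming Spark 3.x (which provides the binaryFile data source):

  // Continuing the sketch above (same spark session and images_meta table).
  val meta = spark.table("images_meta")

  // binaryFile exposes path, modificationTime, length and content columns.
  // In practice the path values may need normalizing (scheme/authority) so
  // they match the references stored in the table.
  val images = spark.read.format("binaryFile")
    .load("hdfs:///data/images/")
    .select($"path".as("image_path"), $"content")

  // Join the bytes in only where they are actually needed, so routine
  // queries on images_meta stay cheap.
  val withBytes = meta.join(images, "image_path")
  withBytes.select("id", "label", "content").show(2)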