Re: hive table with large column data size

2022-01-09 Thread Jörn Franke
It is not good practice to do this. Store the binary files on HDFS and keep
only a reference to them (for example, the HDFS path) in the Hive table.
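
A minimal sketch of this reference-based layout, assuming a Hive-enabled
SparkSession and the binaryFile data source available since Spark 3.0; the
table, column, and label names are hypothetical:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("image-metadata")
  .enableHiveSupport()
  .getOrCreate()
import spark.implicits._

// Keep only lightweight columns plus the HDFS path of each image;
// the image bytes themselves stay as plain files on HDFS.
spark.sql("""
  CREATE TABLE IF NOT EXISTS image_metadata (
    image_id  STRING,
    label     STRING,
    hdfs_path STRING  -- reference to the binary file, not the bytes
  ) STORED AS PARQUET
""")

// Filter on the small columns first, then load only the binaries that
// are actually needed. binaryFile yields path, length, and content.
val paths = spark.sql(
  "SELECT hdfs_path FROM image_metadata WHERE label = 'cat'")
  .as[String]
  .collect()
val images = spark.read.format("binaryFile").load(paths: _*)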

> On 09.01.2022 at 15:34, weoccc wrote:
> 
> 
> Hi,
> 
> I want to store binary data (such as images) in a Hive table, but the
> binary data column might be much larger than the other columns in each
> row, and I'm worried about query performance. One way I can think of is
> to separate the binary data from the other columns by creating two Hive
> tables, running two separate Spark queries, and joining the results
> later (a sketch of this two-table approach follows after the quote).
> 
> Later, I found that the Parquet format supports splitting columns into
> different files, as described here:
> https://parquet.apache.org/documentation/latest/
> 
> I'm wondering whether Spark SQL already supports that, and if so, how to
> use it (see the column-pruning sketch after the quote).
> 
> Weide 
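
A minimal sketch of the two-table split described in the question, reusing
the spark session from the sketch above and assuming both tables exist in
the metastore and share an image_id key; all names are hypothetical:

import spark.implicits._

// Small, frequently queried columns in one table ...
val meta  = spark.table("image_meta")    // image_id, label, ...
// ... and the wide binary column in another, keyed the same way.
val blobs = spark.table("image_blobs")   // image_id, data BINARY

// Filter on the cheap columns first, then join, so the binary payload
// is only read and shuffled for rows that survive the filter.
val needed = meta.filter($"label" === "cat")
val result = needed.join(blobs, Seq("image_id"))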
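
As for the Parquet question: as far as I know, Spark SQL's Parquet writer
keeps all columns of a table in the same files, so there is no per-column
file split to turn on. But because Parquet is columnar, Spark prunes
columns a query does not select, so the binary column's pages are never
read from disk. A small illustration (the path is hypothetical):

// Select only the narrow columns; the wide binary column is skipped.
val df = spark.read.parquet("/warehouse/images")
val metaOnly = df.select("image_id", "label")
metaOnly.explain()  // the plan's ReadSchema lists only image_id, label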

