Hi,
I used a Hive INSERT ... SELECT statement to convert our log files into the
Parquet format. I found that each Parquet file generated is under 400 MB.
Is there any way I can increase the size of the Parquet files generated?
Thanks.
Ey-Chih Chow
Hi all,
I am new to Hive and hopefully this is going to be an easy thing to solve
for someone with more experience, but I am having trouble doing it on my
own.
On my EC2 app server I am running the following command with no error:
*beeline -u jdbc:hive2://master*
This is working on Hive 13 which
Hi Cheolsoo,
Thanks for the correction. I took that for granted and didn't actually
check the code to verify. Yes, in the Spark (1.2) source I did see
their parser, etc. Below is a portion of the README from Spark's sql
package for reference.
Thanks,
Xuefu
Spark SQL is broken up into four
I don’t understand the question.
Why do you want them larger?
Are you looking to merge parquet files?
Are you looking to append to parquet files?
Are you concerned about the small size?
Grant Overby
Bhavana,
Could you send me (omal...@apache.org) the incorrect ORC file? Which
file system were you using? hdfs? Which version of Hadoop and Hive?
Thanks,
Owen
On Fri, May 22, 2015 at 9:37 AM, Grant Overby (groverby) grove...@cisco.com
wrote:
I’m getting the following exception when Hive executes a query on an external
table. It seems the postscript isn't being written, even though .close() is
called and returns normally. Any thoughts?
java.io.IOException: Malformed ORC file
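One way to confirm whether the file's footer and postscript are actually intact is Hive's built-in ORC dump utility. This is a sketch assuming a Hive 0.14 installation; the file path below is a placeholder, not a path from this thread:

```shell
# Dump the metadata of the suspect ORC file straight from HDFS.
# If the postscript/footer was never written, this typically fails
# with the same "Malformed ORC file" error, which isolates the
# problem to the file itself rather than the query.
hive --orcfiledump /path/to/suspect/file.orc
```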
The reason I want them larger is to improve the performance of downstream
map/reduce and Spark jobs; the larger the files, the better those jobs
perform. I would like to know if there is any configuration parameter I can
set for Hive to generate larger files.
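One knob worth trying is Hive's small-file merge step, which can coalesce the many outputs of an INSERT into fewer, larger files. A minimal sketch, assuming a recent Hive (0.13+); the 1 GB targets below are illustrative, not recommendations from this thread:

```sql
-- Merge small output files at the end of the job
-- (tez variant applies when hive.execution.engine=tez).
SET hive.merge.mapfiles=true;
SET hive.merge.mapredfiles=true;
SET hive.merge.tezfiles=true;

-- Files smaller than this average trigger a merge pass,
-- and each merge task aims for roughly this output size.
SET hive.merge.smallfiles.avgsize=1073741824;
SET hive.merge.size.per.task=1073741824;

INSERT OVERWRITE TABLE logs_parquet
SELECT * FROM logs_raw;
```

Output file size is also bounded by how many reducers (or mappers, for a map-only insert) write output, so reducing task parallelism or adding a DISTRIBUTE BY to funnel rows through fewer reducers can have a similar effect.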
I sent a link to you.
File system is hdfs.
Versions:
hdp HDP-2.2.4.2-2
hdfs 2.6.0.2.2
MapReduce2 2.6.0.2.2
YARN 2.6.0.2.2
hive 0.14.0.2.2
tez 0.5.2.2.2
It was a tez query that caused the exception, but I doubt that’s relevant.