hive metastore's schematool -upgradeSchema on postgres throws an error on CREATE TABLE PART_COL_STATS

2015-04-28 Thread jun aoki
Hi Hive community, I am new to Hive and this may be a stupid question, but let me know if you know the answer. I am attempting to upgrade the Hive metastore schema from 0.12 to 0.14. The whole log is here [2]. At the end, the VERSION table shows SCHEMA_VERSION 0.14.0 [1], which was 0.12.0, and it seems

Re: ORC file across multiple HDFS blocks

2015-04-28 Thread Grant Overby (groverby)
Expanding on Alan’s post: Files are intended to span many blocks, and a single file may be read by many mappers. For a file to be read by many mappers, it goes through a process called input splitting, which splits the input around HDFS block boundaries. If a unit of data within a file
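The splitting described above can be illustrated with a minimal sketch. It assumes the simplest case of one split per HDFS block; the function name `block_aligned_splits` is hypothetical, and real input formats may apply minimum and maximum split sizes that this sketch ignores.

```python
def block_aligned_splits(file_length, block_size):
    """Return (offset, length) pairs, one per block of the file.

    Assumption: one split per block; this is an illustration, not
    Hadoop's actual split computation.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    splits = []
    offset = 0
    while offset < file_length:
        # The last split may be shorter than a full block.
        length = min(block_size, file_length - offset)
        splits.append((offset, length))
        offset += length
    return splits
```

Each pair could then be handed to a separate reader, which is roughly what allows many mappers to work on one file.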

RE: hive metastore's schematool -upgradeSchema on postgres throws an error on CREATE TABLE PART_COL_STATS

2015-04-28 Thread Mich Talebzadeh
Hi, my version is 0.14 on an Oracle metastore, and there is no drop command there. The table seems to keep partition column stats, so it is just a stats table: CREATE TABLE PART_COL_STATS ( CS_ID NUMBER NOT NULL, DB_NAME VARCHAR2(128) NOT NULL, TABLE_NAME VARCHAR2(128) NOT NULL,

Re: ORC file across multiple HDFS blocks

2015-04-28 Thread Demai Ni
Alan and Grant, many thanks. Grant's comment is exactly on the point that I am exploring. A bit of background here: I am working on an MPP way to read ORC files through this C++ API (https://github.com/hortonworks/orc) by Owen and team. The MPP mechanism uses one (or several) independent process

Re: ORC file across multiple HDFS blocks

2015-04-28 Thread Owen O'Malley
You can also use the C++ reader to read a set of stripes. Look at the ReaderOptions.range(offset, length), which selects the range of stripes to process in terms of bytes. .. Owen On Tue, Apr 28, 2015 at 11:02 AM, Demai Ni nid...@gmail.com wrote: Alan and Grant, many thanks. Grant's comment
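As a rough illustration of selecting stripes by a byte range, here is a small sketch in Python rather than the C++ API. It assumes that a stripe belongs to a range when the stripe's starting byte falls inside it, which the thread does not confirm; `stripes_in_range` and its parameters are hypothetical names, not part of the ORC library.

```python
def stripes_in_range(stripe_offsets, offset, length):
    """Return indices of stripes whose start byte lies in [offset, offset + length).

    Assumption: a stripe is assigned to the range containing its first byte,
    so adjacent, non-overlapping ranges never read the same stripe twice.
    """
    end = offset + length
    return [i for i, start in enumerate(stripe_offsets) if offset <= start < end]
```

Under this assumption, several independent processes could each take a disjoint byte range of the same file and together cover every stripe exactly once.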

default number of reducers

2015-04-28 Thread Shushant Arora
In a normal MR job, can I configure a cluster-wide default number of reducers, to be used if I don't specify any reducers in my job?

Re: ORC file across multiple HDFS blocks

2015-04-28 Thread Demai Ni
Owen, cool. That is great. Thanks Demai On Tue, Apr 28, 2015 at 11:10 AM, Owen O'Malley omal...@apache.org wrote: You can also use the C++ reader to read a set of stripes. Look at the ReaderOptions.range(offset, length), which selects the range of stripes to process in terms of bytes. ..