Actually, there is a legacy feature in Hive that would do exactly what you need :
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Select#LanguageManualSelect-REGEXColumnSpecification Another way would simply to use dynamic partitioning : INSERT INTO TABLE employee_orc PARTITION *(country, office)* select * from employee where country='USA' and office='HQ-TX'; The only downside of this method (that I know of), is that Hive will put a lock on every partitions, even if only one is inserted. I also know a third solution, but I can't talk about it yet. On Tue, Dec 13, 2016 at 3:27 PM, Joaquin Alzola <joaquin.alz...@lebara.com> wrote: > Hi Suresh > > > > I choose the * and not the specific fields because I have 520 columns. > > The data that I tested was only a testing ground. > > > > I suppose then that I need to select the 520 fileds. L > > > > > > > > *From:* Suresh Kumar Sethuramaswamy [mailto:rock...@gmail.com] > *Sent:* 13 December 2016 14:19 > *To:* user@hive.apache.org > *Subject:* Re: PARTITION error because different columns size > > > > Hi Joaquin > > > > In hive , when u run 'select * from employee' it is going to return > the partitioned columns also at the end, whereas you don't want that to be > inserted into ur ORC table , so ur insert query should look like > > > > INSERT INTO TABLE employee_orc PARTITION (country='USA', > office='HQ-TX') select eid,salary from employee where country='USA' and > office='HQ-TX'; > > > > > > Remember partition in hive is a physical folder name > > > > Regards > > Suresh > > > > > > > > On Tue, Dec 13, 2016 at 6:37 AM Joaquin Alzola <joaquin.alz...@lebara.com> > wrote: > > > > > > > > > > > > > > > > > Hi List > > > > > > > > I change Spark to 2.0.2 and Hive 2.0.1. > > > > I have the bellow tables but the INSERT INTO TABLE employee_orc PARTITION > (country='USA', office='HQ-TX') select * from employee where country='USA' > and office='HQ-TX'; > > > > Is giving me à Cannot insert into table `default`.`employee_orc` because > the number of columns are different: need 4 columns, but query has 6 > columns.; > > > > > > > > When doing select it is adding the Partition as columns …. > > > > > > > > CREATE TABLE IF NOT EXISTS employee ( eid int, name String, > > > > salary String, destination String) > > > > COMMENT 'Employee details' > > > > PARTITIONED BY(country string, office string) > > > > ROW FORMAT DELIMITED > > > > FIELDS TERMINATED BY '\t' > > > > LINES TERMINATED BY '\n' > > > > STORED AS TEXTFILE; > > > > > > > > CREATE TABLE IF NOT EXISTS employee_orc ( eid int, name String, > > > > salary String, destination String) > > > > COMMENT 'Employee details' > > > > PARTITIONED BY(country string, office string) > > > > STORED AS ORC tblproperties ("orc.compress"="ZLIB"); > > > > > > > > 0: jdbc:hive2://localhost:10000> LOAD DATA LOCAL INPATH > '/mnt/sample.txt.gz' INTO TABLE employee PARTITION (country='USA', > office='HQ-TX'); > > > > +---------+--+ > > > > | Result | > > > > +---------+--+ > > > > +---------+--+ > > > > No rows selected (0.685 seconds) > > > > 0: jdbc:hive2://localhost:10000> select * from employee; > > > > +-------+--------------+---------+--------------------+----- > -----+---------+--+ > > > > | eid | name | salary | destination | country | office > | > > > > +-------+--------------+---------+--------------------+----- > -----+---------+--+ > > > > | 1201 | Gopal | 45000 | Technical manager | USA | HQ-TX > | > > > > | 1202 | Manisha | 45000 | Proof reader | USA | HQ-TX > | > > > > | 1203 | Masthanvali | 40000 | Technical writer | USA | HQ-TX > | > > > > | 1204 | Kiran | 40000 | Hr Admin | USA | HQ-TX > | > > > > | 1205 | Kranthi | 30000 | Op Admin | USA | HQ-TX > | > > > > +-------+--------------+---------+--------------------+----- > -----+---------+--+ > > > > 5 rows selected (0.358 seconds) > > > > 0: jdbc:hive2://localhost:10000> INSERT INTO TABLE employee_orc PARTITION > (country='USA', office='HQ-TX') select * from employee where country='USA' > and office='HQ-TX'; > > > > Error: org.apache.spark.sql.AnalysisException: Cannot insert into table > `default`.`employee_orc` because the number of columns are different: need > 4 columns, but query has 6 columns.; (state=,code=0) > > > > > > > > > > > > 0: jdbc:hive2://localhost:10000> describe employee_orc; > > > > +--------------------------+------------+----------+--+ > > > > | col_name | data_type | comment | > > > > +--------------------------+------------+----------+--+ > > > > | eid | int | NULL | > > > > | name | string | NULL | > > > > | salary | string | NULL | > > > > | destination | string | NULL | > > > > | country | string | NULL | > > > > | office | string | NULL | > > > > | # Partition Information | | | > > > > | # col_name | data_type | comment | > > > > | country | string | NULL | > > > > | office | string | NULL | > > > > +--------------------------+------------+----------+--+ > > > > > > > > 0: jdbc:hive2://localhost:10000> describe employee; > > > > +--------------------------+------------+----------+--+ > > > > | col_name | data_type | comment | > > > > +--------------------------+------------+----------+--+ > > > > | eid | int | NULL | > > > > | name | string | NULL | > > > > | salary | string | NULL | > > > > | destination | string | NULL | > > > > | country | string | NULL | > > > > | office | string | NULL | > > > > | # Partition Information | | | > > > > | # col_name | data_type | comment | > > > > | country | string | NULL | > > > > | office | string | NULL | > > > > +--------------------------+------------+----------+--+ > > > > 10 rows selected (0.045 seconds) > > > > > > This email is confidential and may be subject to privilege. If you are not > the intended recipient, please do not copy or disclose its content but > contact the sender immediately upon receipt. > > > > > This email is confidential and may be subject to privilege. If you are not > the intended recipient, please do not copy or disclose its content but > contact the sender immediately upon receipt. >