Jay can you give first 3 lines of your gz file On Fri, May 1, 2015 at 10:53 AM, Kumar Jayapal <[email protected]> wrote:
> Alex, > > > I followed the same steps as mentioned in the site. Once I load data into > table which is create below > > > > Created table CREATE TABLE raw (line STRING) PARTITIONED BY (FISCAL_YEAR > smallint, FISCAL_PERIOD smallint) > STORED AS TEXTFILE; > > and loaded it with data. > > LOAD DATA LOCAL INPATH '/tmp/weblogs/20090603-access.log.gz' INTO TABLE > raw; > > > > when I say select * from raw it shows all null values. > > > NULLNULLNULLNULLNULLNULLNULLNULL > NULLNULLNULLNULLNULLNULLNULLNULL > NULLNULLNULLNULLNULLNULLNULLNULL > NULLNULLNULLNULLNULLNULLNULLNULL > Why is not show showing the actual data in file. will it show once I load > it to parque table? > > Please let me know if I am doing anything wrong. > > I appreciate your help. > > > Thanks > jay > > > > Thank you very much for you help Alex, > > > On Wed, Apr 29, 2015 at 3:43 PM, Alexander Pivovarov <[email protected] > > wrote: > >> 1. Create external textfile hive table pointing to /extract/DBCLOC and >> specify CSVSerde >> >> if using hive-0.14 and newer use this >> https://cwiki.apache.org/confluence/display/Hive/CSV+Serde >> if hive-0.13 and older use https://github.com/ogrodnek/csv-serde >> >> You do not even need to unzgip the file. hive automatically unzgip data >> on select. >> >> 2. run simple query to load data >> insert overwrite table <orc_table> >> select * from <csv_table> >> >> On Wed, Apr 29, 2015 at 3:26 PM, Kumar Jayapal <[email protected]> >> wrote: >> >>> Hello All, >>> >>> >>> I have this table >>> >>> >>> CREATE TABLE DBCLOC( >>> BLwhse int COMMENT 'DECIMAL(5,0) Whse', >>> BLsdat string COMMENT 'DATE Sales Date', >>> BLreg_num smallint COMMENT 'DECIMAL(3,0) Reg#', >>> BLtrn_num int COMMENT 'DECIMAL(5,0) Trn#', >>> BLscnr string COMMENT 'CHAR(1) Scenario', >>> BLareq string COMMENT 'CHAR(1) Act Requested', >>> BLatak string COMMENT 'CHAR(1) Act Taken', >>> BLmsgc string COMMENT 'CHAR(3) Msg Code') >>> PARTITIONED BY (FSCAL_YEAR smallint, FSCAL_PERIOD smallint) >>> STORED AS PARQUET; >>> >>> have to load from hdfs location /extract/DBCLOC/DBCL0301P.csv.gz to >>> the table above >>> >>> >>> Can any one tell me what is the most efficient way of doing it. >>> >>> >>> Thanks >>> Jay >>> >> >> > -- Nitin Pawar
