WITH SERDEPROPERTIES ('input.regex'='(\\w+) (\\w+) (\\w+) (\\w+)\\t(\\w+) (\\w+) (\\w+)')

The reason for the double backslash is that the Hive string constant takes one level of escaping, and the regular expression takes another.

Please let us know where you saw the 'regex'='...' syntax. It's outdated, and we need to update it.

Zheng

On Wed, Sep 9, 2009 at 9:15 PM, Mayuran Yogarajah <[email protected]> wrote:
> I have a file in HDFS which has the following format:
> c1<space>c2<space>c3<space>c4<tab>c5<space>c6<space>c7
>
> where cX represents column X.
>
> Can someone please show me how I can create a table in Hive for this?
>
> I tried the following but it gave an error:
> CREATE TABLE test (
> c1 STRING,
> c2 STRING,
> c3 STRING,
> c4 STRING,
> c5 STRING,
> c6 STRING,
> c7 STRING )
> ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
> WITH SERDEPROPERTIES ('regex'='(\w+) (\w+) (\w+) (\w+)\t(\w+) (\w+) (\w+)')
> STORED AS TEXTFILE;
>
> hive> load data inpath '/user/hadoop/test' into table test;
>
> hive> select * from test;
> OK
> Failed with exception
> java.io.IOException:org.apache.hadoop.hive.serde2.SerDeException: This table
> does not have serde property "input.regex"!
>
> Thank you very much =)
>
>

--
Yours,
Zheng
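
To illustrate the two levels of escaping mentioned in the reply, here is a minimal Python sketch. It is only a stand-in: Hive's RegexSerDe applies the pattern with Java's regex engine, and the unescaping step below is a simplified model that does nothing more than collapse doubled backslashes. The helper names (`hive_unescape`, `parse_line`) and the sample row are made up for illustration.

```python
import re

# The property value as typed inside the Hive string constant
# (raw string, so every backslash shown here is a literal character).
HIVE_LITERAL = r"(\\w+) (\\w+) (\\w+) (\\w+)\\t(\\w+) (\\w+) (\\w+)"


def hive_unescape(literal: str) -> str:
    # Level 1 (simplified assumption): the string constant turns each
    # doubled backslash into a single backslash.
    return literal.replace("\\\\", "\\")


def parse_line(line: str):
    # Level 2: the regex engine interprets \w and \t in the unescaped pattern.
    # fullmatch is used so the whole line has to fit the pattern.
    pattern = re.compile(hive_unescape(HIVE_LITERAL))
    match = pattern.fullmatch(line)
    return match.groups() if match else None


# Hypothetical row in the c1..c4<tab>c5..c7 layout from the question.
sample = "a1 b2 c3 d4\te5 f6 g7"

regex = hive_unescape(HIVE_LITERAL)
columns = parse_line(sample)

print(regex)    # the pattern the regex engine actually receives
print(columns)  # one captured group per table column
```

Each capturing group corresponds to one of the seven STRING columns, in order. A row that uses a space where the tab is expected does not match at all, so `parse_line` returns `None` for it in this sketch.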
