Hi dujinhang,

See http://wiki.apache.org/hadoop/Hive/UserGuide

add jar ../build/contrib/hive_contrib.jar;

CREATE TABLE apachelog (
  host STRING,
  identity STRING,
  user STRING,
  time STRING,
  request STRING,
  status STRING,
  size STRING,
  referer STRING,
  agent STRING)
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
  "input.regex" = "([^ ]*) ([^ ]*) ([^ ]*) (-|\\[[^\\]]*\\]) ([^ \"]*|\"[^\"]*\") (-|[0-9]*) (-|[0-9]*)(?: ([^ \"]*|\"[^\"]*\") ([^ \"]*|\"[^\"]*\"))?",
  "output.format.string" = "%1$s %2$s %3$s %4$s %5$s %6$s %7$s %8$s %9$s"
)
STORED AS TEXTFILE;

Maybe you need to set the 'output.format.string' property.

HTH,

- Youngwoo

2011/5/31 jinhang du <[email protected]>

> How does the columns in the table match the "input.regex" ?
>
> In other words, which part of the regex matches the columns of the table?
>
> Will anybody offer some help?
>
> 2011/5/30 YUYANG LAN <[email protected]>
>
>> hi, how about this ?
>>
>> (.+)&&&(.+?)(?:\^\^.*)?
>>
>> On Mon, May 30, 2011 at 6:07 PM, jinhang du <[email protected]> wrote:
>> > My data format is as follows:
>> > a&&&b
>> > c&&&b^^xyz
>> > c&&&d^^hdo
>> > create table f(str1 string, str2 string) ROW FORMAT SERDE
>> > 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
>> > With SERDEPROPERTIES (
>> > "input.regex"="(.+)&&&(.+)(\^\^.+)?"
>> > )
>> >
>> > My aim is :
>> > a b
>> > c b
>> > c d
>> > However ,
>> > a b
>> > c b^^xyz
>> > c d^^hdo
>> > So how to fix the regex to get the right answer?
>> > Thank you for help.
>> > --
>> > dujinhang
>> >
>>
>> --
>> -------------------------------------------------------
>> DAVID RAN UYOU //
>>
>
> --
> dujinhang
>
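
On the question of how columns line up with "input.regex": as far as I understand RegexSerDe, the n-th capturing group fills the n-th column, and non-capturing groups "(?: ... )" fill nothing. The rough sketch below checks the two patterns from this thread against the three sample rows. It uses Python's re module as a stand-in for the Java regex engine Hive uses, and it assumes the SerDe requires the whole line to match (hence fullmatch). The names ROWS, ORIGINAL, SUGGESTED and parse are made up for illustration.

```python
import re

# The three sample rows from the original question.
ROWS = ["a&&&b", "c&&&b^^xyz", "c&&&d^^hdo"]

# Pattern from the original question: the second group is greedy, so it
# swallows the "^^..." suffix and the optional third group stays empty.
ORIGINAL = r"(.+)&&&(.+)(\^\^.+)?"

# Pattern suggested in the reply: the second group is lazy and the suffix
# sits in a non-capturing group, so only two groups (two columns) remain.
SUGGESTED = r"(.+)&&&(.+?)(?:\^\^.*)?"


def parse(pattern, rows):
    """Return one tuple of captured groups per row, or None if a row fails to match."""
    compiled = re.compile(pattern)
    result = []
    for row in rows:
        # Assumption: the whole line must match, so fullmatch is used here.
        m = compiled.fullmatch(row)
        result.append(m.groups() if m else None)
    return result


original_rows = parse(ORIGINAL, ROWS)
suggested_rows = parse(SUGGESTED, ROWS)

for row, cols in zip(ROWS, suggested_rows):
    print(row, "->", cols)
```

With the suggested pattern each row yields exactly two groups, matching str1 and str2. When the pattern goes into a SERDEPROPERTIES string, the backslashes probably need to be doubled ("\\^\\^"), the same way the apachelog example writes "\\[".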
