Hi dujinhang,

See the RegexSerDe example at http://wiki.apache.org/hadoop/Hive/UserGuide:

add jar ../build/contrib/hive_contrib.jar;

CREATE TABLE apachelog (
  host STRING,
  identity STRING,
  user STRING,
  time STRING,
  request STRING,
  status STRING,
  size STRING,
  referer STRING,
  agent STRING)
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
  "input.regex" = "([^ ]*) ([^ ]*) ([^ ]*) (-|\\[[^\\]]*\\]) ([^ \"]*|\"[^\"]*\") (-|[0-9]*) (-|[0-9]*)(?: ([^ \"]*|\"[^\"]*\") ([^ \"]*|\"[^\"]*\"))?",
  "output.format.string" = "%1$s %2$s %3$s %4$s %5$s %6$s %7$s %8$s %9$s"
)
STORED AS TEXTFILE;
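
On how the columns match the regex: as far as I understand RegexSerDe, the n-th capturing group fills the n-th column of the table, and non-capturing groups like (?: ...) are skipped. Here is a rough Python sketch of that mapping. It is only an illustration, not what Hive runs: it assumes the whole line must match (Python's re.fullmatch standing in for Java's Matcher.matches()), and the sample log line and the parse() helper are made up.

```python
import re

# Same pattern as "input.regex" above, written the way the regex engine
# sees it after SQL-string unescaping (\\[ becomes \[ and \" becomes ").
INPUT_REGEX = (
    r'([^ ]*) ([^ ]*) ([^ ]*) (-|\[[^\]]*\]) ([^ "]*|"[^"]*") '
    r'(-|[0-9]*) (-|[0-9]*)(?: ([^ "]*|"[^"]*") ([^ "]*|"[^"]*"))?'
)

# Column names in the same order as in the CREATE TABLE statement.
COLUMNS = ["host", "identity", "user", "time", "request",
           "status", "size", "referer", "agent"]

def parse(line):
    """Hypothetical helper: pair capture group n with column n."""
    m = re.fullmatch(INPUT_REGEX, line)
    if m is None:
        return None  # the line does not match the pattern at all
    return dict(zip(COLUMNS, m.groups()))

# Made-up sample line without the optional referer/agent part.
row = parse('127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
            '"GET /index.html HTTP/1.0" 200 2326')
for name in COLUMNS:
    print(name, "=", row[name])
```

The last two groups sit inside the optional (?: ...)? part, so for a line without referer and agent they come back empty (None in Python).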


Maybe you also need to set the 'output.format.string' property.
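
For your own data, the sketch below compares your original pattern with the lazy one YUYANG suggested, on the three sample rows from your mail. Again this is Python rather than Hive, under the assumption that the whole row must match the pattern; the extract() helper is made up for the illustration.

```python
import re

# Sample rows from the original question.
ROWS = ["a&&&b", "c&&&b^^xyz", "c&&&d^^hdo"]

# Original pattern: the second (.+) is greedy, so it swallows "^^xyz"
# and the optional third group is left with nothing to match.
GREEDY = r"(.+)&&&(.+)(\^\^.+)?"

# Suggested pattern: (.+?) is lazy and the "^^..." tail is a
# non-capturing group, so exactly two groups remain for two columns.
LAZY = r"(.+)&&&(.+?)(?:\^\^.*)?"

def extract(pattern, line):
    """Hypothetical helper: return the capture groups of a full match."""
    m = re.fullmatch(pattern, line)
    return m.groups() if m else None

greedy = [extract(GREEDY, r) for r in ROWS]
lazy = [extract(LAZY, r) for r in ROWS]

for line, g, l in zip(ROWS, greedy, lazy):
    print(line, "greedy:", g, "lazy:", l)
```

With the lazy pattern the rows come out as (a, b), (c, b), (c, d). One thing to watch when you put it into SERDEPROPERTIES: I believe Hive unescapes backslashes in string literals, so the pattern probably has to be written with doubled backslashes, the same way the wiki example writes \\[ above:

  "input.regex" = "(.+)&&&(.+?)(?:\\^\\^.*)?"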

HTH,

- Youngwoo

2011/5/31 jinhang du <[email protected]>

> How do the columns in the table match the "input.regex"?
>
> In other words, which part of the regex matches which column of the table?
>
> Could anybody offer some help?
>
> 2011/5/30 YUYANG LAN <[email protected]>
>
>> Hi, how about this?
>>
>> (.+)&&&(.+?)(?:\^\^.*)?
>>
>> On Mon, May 30, 2011 at 6:07 PM, jinhang du <[email protected]> wrote:
>> > My data format is as follows:
>> > a&&&b
>> > c&&&b^^xyz
>> > c&&&d^^hdo
>> > create table f(str1 string, str2 string) ROW FORMAT SERDE
>> > 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
>> > With SERDEPROPERTIES (
>> > "input.regex"="(.+)&&&(.+)(\^\^.+)?"
>> > )
>> >
>> > The result I want is:
>> > a b
>> > c b
>> > c d
>> > However, what I get is:
>> > a b
>> > c b^^xyz
>> > c d^^hdo
>> > How should I fix the regex to get the right result?
>> > Thank you for your help.
>> > --
>> > dujinhang
>> >
>>
>>
>>
>> --
>> -------------------------------------------------------
>> DAVID RAN UYOU //
>>
>
>
>
> --
> dujinhang
>
