Re: Regex and serde with hive

Loren Siebert Thu, 22 Dec 2011 23:27:56 -0800

The input regexp does not look right to me. You are expecting a space between 
groups, but your example contains no spaces. And where do you handle the 
first/last quotes? Wouldn’t it look more like this:
"input.regex" = "\"([^\"~]*)[\"~]*([^\"~]*)[\"~]*([^\"~]*)\""
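
Both patterns can be checked against the sample line outside Hive. The following is only a minimal sketch: Python's re module stands in for the Java regex engine that RegexSerDe relies on, and it assumes the file contains plain ASCII double quotes and that a row is only parsed when the whole line matches. The helper name columns is hypothetical.

```python
import re

# Sample row from the question, assuming plain ASCII quotes in the file.
line = '"Xyz"~"qsd"~"1234"'

# Pattern from the original table definition, as the regex engine sees it
# once the \" escapes in the Hive string literal are resolved.
original = r'([^"~]*) ([^"~]*) ([^"~]*)?'

# Suggested pattern that accounts for the quotes and the ~ separators.
suggested = r'"([^"~]*)["~]*([^"~]*)["~]*([^"~]*)"'


def columns(pattern, text):
    """Return the captured groups if the whole line matches, else None."""
    m = re.fullmatch(pattern, text)
    return list(m.groups()) if m else None


print(columns(original, line))   # no spaces in the line, so no match
print(columns(suggested, line))  # one captured value per column
```

A row that fails to match comes back with no column values, which is consistent with the NULLs described in the question.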


Rather than trying to tackle it all at once, I find it easier to start with a 
table of one column and then build up from there until I have all my columns.
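
The incremental approach can be rehearsed on the regex alone before touching the table definition. This is a sketch under assumptions: the three step patterns and the trailing .* (which swallows the columns not yet handled) are illustrative and are not taken from the original thread, and Python's re module again stands in for the Java regex engine.

```python
import re

# Sample row, assuming plain ASCII quotes in the file.
line = '"Xyz"~"qsd"~"1234"'

# Hypothetical build-up: capture one more column at each step.
# The trailing .* absorbs whatever has not been given a group yet.
steps = [
    r'"([^"~]*)".*',                        # one column
    r'"([^"~]*)"~"([^"~]*)".*',             # two columns
    r'"([^"~]*)"~"([^"~]*)"~"([^"~]*)"',    # all three columns
]


def build_up(text):
    """Apply each step to the whole line and collect the captured groups."""
    results = []
    for pattern in steps:
        m = re.fullmatch(pattern, text)
        results.append(m.groups() if m else None)
    return results


for pattern, groups in zip(steps, build_up(line)):
    print(pattern, "->", groups)
```

Once a step matches, the same pattern (with each " written as \" inside the Hive string literal) would go into "input.regex" for a table with that many columns.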

On Dec 22, 2011, at 8:49 PM, Raghunath, Ranjith wrote:

> I have been struggling with this for a while so I would appreciate any advice 
> that any of you may have.
>  
> I have a file of the format
>  
> "Xyz"~"qsd"~"1234"
>  
> I created the following table definition to get the data loaded
>  
> CREATE TABLE dummy
> (f1   string,
>   f2    string,
>   f3     string)
> ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
> WITH SERDEPROPERTIES  (
> "input.regex" = "([^\"~]*) ([^\"~]*) ([^\"~]*)?",
> "output.format.string" = "%1$s %2$s %3$s");
>  
> When I load the data and run a select, I get NULL values. Thanks 
> again.
> Thank you,
> Ranjith
>
