Writing columns having NULL as some string using OpenCSVSerde

2017-03-27 Thread Punit Shetty
Hi,

I'm using 'org.apache.hadoop.hive.serde2.OpenCSVSerde' to write hive table
data.
CREATE TABLE testtable ROW FORMAT SERDE
'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES (
   "separatorChar" = ",",
   "quoteChar" = "'"
)
STORED AS TEXTFILE LOCATION '' AS
select * from foo;

So, if 'foo' table has empty strings in it, for eg: '1','2','' . The empty
strings are written as is to the textfile. The data in textfile reads
'1','2',''

But if 'foo' contains null values, for eg: '1','2',null. The null value is
not written in the text file.
The data in the textfile reads '1','2',

How do I make sure that the nulls are properly written to the textfile
using csv serde. Either written as empty strings or any other string say
"nullstring"?

I tried using ""serialization.null.format" = "" in the SERDEPROPERTIES. But
that doesn't seem to work probably because csvserde doesn't support it or
its the proper use case.

Please guide me on how to write nulls to csv files.

Thanks,
Punit


Re: Missing column headers from S3 file

2017-03-09 Thread Punit Shetty
Yes, running the CTAS command on hive CLI does print the headers once the
query execution is completed.

I think this feature is yet to be implemented: https://issues.
apache.org/jira/browse/HIVE-12860

On Thu, Mar 9, 2017 at 9:50 AM, Punit Shetty <punitshett...@gmail.com>
wrote:

> Yes, running the CTAS command on hive CLI does print the headers once the
> query execution is completed.
>
> I think this feature is yet to be implemented: https://issues.
> apache.org/jira/browse/HIVE-12860
>
>
> On Wed, Mar 8, 2017 at 5:51 PM, Pushkar.Gujar <pushkarvgu...@gmail.com>
> wrote:
>
>> apologies,
>>
>> my command was
>>
>> hive -e "set hive.cli.print.header=true; *select * from abc*" >>
>> output.txt
>>
>>
>> Thank you,
>> *Pushkar Gujar*
>>
>>
>> On Wed, Mar 8, 2017 at 8:49 PM, Pushkar.Gujar <pushkarvgu...@gmail.com>
>> wrote:
>>
>>> if you were running CTAS command from hive CLI, you might have noticed
>>> that headers get printed in CLI once the query execution is completed. I
>>> think the property hive.cli.print.header is there to only print headers
>>> only in cli.
>>>
>>> Not sure about S3, but I tried below which worked perfectly
>>>
>>> hive -e "set hive.cli.print.header=true; create table." >> output.txt
>>>
>>>
>>> Thank you,
>>> *Pushkar Gujar*
>>>
>>>
>>> On Wed, Mar 8, 2017 at 6:47 PM, Punit Shetty <punitshett...@gmail.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> I'm trying to get the column headers into a S3 file.
>>>>
>>>> When I run this on the hive command line:
>>>> set hive.cli.print.header=true;
>>>> select * from abc;
>>>>
>>>> Things work fine, I'm able to see the column headers with the table
>>>> data on the console.
>>>>
>>>> But my query is, when I try something like this:
>>>> set hive.cli.print.header=true;
>>>> CREATE TABLE test123 ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
>>>> LINES TERMINATED BY '\n' STORED AS TEXTFILE LOCATION 's3://somelocation' AS
>>>> select * abc;
>>>>
>>>> The column headers are missing from the S3 file.
>>>>
>>>> Is this the expected behaviour or am I missing something?
>>>>
>>>> Thanks,
>>>> Punit
>>>>
>>>
>>>
>>
>


Missing column headers from S3 file

2017-03-08 Thread Punit Shetty
Hi,

I'm trying to get the column headers into a S3 file.

When I run this on the hive command line:
set hive.cli.print.header=true;
select * from abc;

Things work fine, I'm able to see the column headers with the table data on
the console.

But my query is, when I try something like this:
set hive.cli.print.header=true;
CREATE TABLE test123 ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LINES
TERMINATED BY '\n' STORED AS TEXTFILE LOCATION 's3://somelocation' AS
select * abc;

The column headers are missing from the S3 file.

Is this the expected behaviour or am I missing something?

Thanks,
Punit