Re: Configurable NULL in IDF or Connector?

Jarek Jarcec Cecho Mon, 01 Dec 2014 10:59:17 -0800

I share Gwen's point of view. The CSV format for the IDF is very strict so 
that we have minimal surface area for inconsistencies between multiple 
connectors. This is because the IDF is an agreed-upon exchange format for 
transferring data from one connector to another. That, however, shouldn't 
stop an individual connector (such as HDFS) from offering ways to save the 
resulting CSV differently.
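
To make the split concrete, here is a minimal Python sketch. It is not Sqoop 
code: the function names, the fixed "NULL" token, and the single-quote 
escaping are assumptions chosen purely for illustration. The exchange format 
has no knobs, while the connector-side writer takes the NULL string and the 
separator as options.

```python
# Illustrative sketch only; names and format details are assumptions,
# not the actual Sqoop IDF or HDFS connector implementation.

IDF_NULL = "NULL"  # assumed fixed NULL token of the strict exchange format


def to_idf_csv(record):
    """Render a record in the strict exchange form: nothing is configurable."""
    fields = []
    for value in record:
        if value is None:
            fields.append(IDF_NULL)
        elif isinstance(value, str):
            # Escape backslashes first, then single quotes, then quote the field.
            escaped = value.replace("\\", "\\\\").replace("'", "\\'")
            fields.append("'" + escaped + "'")
        else:
            fields.append(str(value))
    return ",".join(fields)


def to_hdfs_text(record, null_value="NULL", separator=","):
    """Render a record the way a connector might write it, with user options."""
    fields = [null_value if value is None else str(value) for value in record]
    return separator.join(fields)
```

With this shape, a user who wants Hive-style output could call 
to_hdfs_text(record, null_value="\\N", separator="\t") without the exchange 
format itself changing.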


We had a similar discussion about separator and quote characters in 
SQOOP-1522, which seems relevant to the NULL discussion here.

Jarcec

> On Dec 1, 2014, at 10:42 AM, Gwen Shapira <[email protected]> wrote:
> 
> I think it's two different things:
> 
> 1. HDFS connector should give more control over the formatting of the
> data in text files (nulls, escaping, etc)
> 2. IDF should give NULLs in a format that is optimized for
> MySQL/Postgres direct connectors (since that's one of the IDF design
> goals).
> 
> Gwen
> 
> On Mon, Dec 1, 2014 at 9:52 AM, Abraham Elmahrek <[email protected]> wrote:
>> Hey guys,
>> 
>> Any thoughts on where configurable NULL values should be? Either the IDF or
>> HDFS connector?
>> 
>> cf: https://issues.apache.org/jira/browse/SQOOP-1678
>> 
>> -Abe
