I share Gwen's point of view. The CSV format for the IDF is deliberately strict so that there is minimal surface area for inconsistencies between connectors: the IDF is the agreed-upon exchange format for transferring data from one connector to another. That, however, shouldn't stop a connector (such as HDFS) from offering ways to save the resulting CSV differently.
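As a rough illustration of that split (a sketch, not the actual Sqoop code): the snippet below assumes the IDF writes a null as the unquoted token NULL and wraps strings in single quotes with backslash escapes. The names `split_fields`, `render_for_hdfs` and `null_value` are hypothetical. The strict record stays untouched between connectors, and only the HDFS side rewrites nulls when saving.

```python
def split_fields(record: str) -> list[str]:
    """Split a CSV record on commas that sit outside single-quoted strings.

    Assumption: strings are single-quoted and use backslash escapes.
    """
    fields, current = [], []
    in_quotes = False
    escaped = False
    for ch in record:
        if escaped:
            # Character following a backslash is kept verbatim.
            current.append(ch)
            escaped = False
        elif ch == "\\":
            current.append(ch)
            escaped = True
        elif ch == "'":
            current.append(ch)
            in_quotes = not in_quotes
        elif ch == "," and not in_quotes:
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
    fields.append("".join(current))
    return fields


def render_for_hdfs(record: str, null_value: str = "\\N") -> str:
    """Rewrite unquoted NULL fields with a connector-level null string.

    Quoted fields such as 'NULL' are real strings and are left alone.
    """
    return ",".join(
        null_value if field == "NULL" else field
        for field in split_fields(record)
    )
```

For example, `render_for_hdfs("1,NULL,'NULL'", "")` would leave the quoted string alone and write the real null as an empty field.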
We had a similar discussion about separator and quote characters in SQOOP-1522, which seems relevant to the NULL discussion here.

Jarcec

> On Dec 1, 2014, at 10:42 AM, Gwen Shapira <[email protected]> wrote:
>
> I think it's two different things:
>
> 1. HDFS connector should give more control over the formatting of the
> data in text files (nulls, escaping, etc)
> 2. IDF should give NULLs in a format that is optimized for
> MySQL/Postgres direct connectors (since that's one of the IDF design
> goals).
>
> Gwen
>
> On Mon, Dec 1, 2014 at 9:52 AM, Abraham Elmahrek <[email protected]> wrote:
>> Hey guys,
>>
>> Any thoughts on where configurable NULL values should be? Either the IDF or
>> HDFS connector?
>>
>> cf: https://issues.apache.org/jira/browse/SQOOP-1678
>>
>> -Abe
