Indeed. I created SQOOP-1678 to address #1. Let me re-define it...
Also, for #2... There are a few ways of generating output. It seems NULL values range from "\N" to 0x0 to "NULL". I think keeping NULL makes sense.

On Mon, Dec 1, 2014 at 10:58 AM, Jarek Jarcec Cecho <[email protected]> wrote:

> I do share the same point of view as Gwen. The CSV format for UDF is very
> strict so that we have minimal surface area for inconsistencies between
> multiple connectors. This is because the IDF is an agreed upon exchange
> format when transferring data from one connector to the other. That however
> shouldn't stop one connector (such as HDFS) to offer ways to save the
> resulting CSV differently.
>
> We had similar discussion about separator and quote characters in
> SQOOP-1522 that seems to be relevant to the NULL discussion here.
>
> Jarcec
>
> > On Dec 1, 2014, at 10:42 AM, Gwen Shapira <[email protected]> wrote:
> >
> > I think its two different things:
> >
> > 1. HDFS connector should give more control over the formatting of the
> > data in text files (nulls, escaping, etc)
> > 2. IDF should give NULLs in a format that is optimized for
> > MySQL/Postgres direct connectors (since thats one of the IDF design
> > goals).
> >
> > Gwen
> >
> > On Mon, Dec 1, 2014 at 9:52 AM, Abraham Elmahrek <[email protected]> wrote:
> >> Hey guys,
> >>
> >> Any thoughts on where configurable NULL values should be? Either the IDF or
> >> HDFS connector?
> >>
> >> cf: https://issues.apache.org/jira/browse/SQOOP-1678
> >>
> >> -Abe
> >
