+1 Gwen. Let's have a separate ticket for #2, since this should be part of the Sqoop guidelines, especially for the CSV String.

Best,
*./Vee*

On Mon, Dec 1, 2014 at 11:26 AM, Abraham Elmahrek <[email protected]> wrote:

> Indeed. SQOOP-1678, which I created, is intended to address #1. Let me
> re-define it...
>
> Also, for #2... There are a few ways of generating output. It seems NULL
> values range from "\N" to 0x0 to "NULL". I think keeping NULL makes sense.
>
> On Mon, Dec 1, 2014 at 10:58 AM, Jarek Jarcec Cecho <[email protected]> wrote:
>
> > I do share the same point of view as Gwen. The CSV format for IDF is very
> > strict so that we have minimal surface area for inconsistencies between
> > multiple connectors. This is because the IDF is an agreed-upon exchange
> > format when transferring data from one connector to the other. That,
> > however, shouldn't stop one connector (such as HDFS) from offering ways
> > to save the resulting CSV differently.
> >
> > We had a similar discussion about separator and quote characters in
> > SQOOP-1522 that seems to be relevant to the NULL discussion here.
> >
> > Jarcec
> >
> > > On Dec 1, 2014, at 10:42 AM, Gwen Shapira <[email protected]> wrote:
> > >
> > > I think it's two different things:
> > >
> > > 1. HDFS connector should give more control over the formatting of the
> > >    data in text files (nulls, escaping, etc.)
> > > 2. IDF should give NULLs in a format that is optimized for
> > >    MySQL/Postgres direct connectors (since that's one of the IDF design
> > >    goals).
> > >
> > > Gwen
> > >
> > > On Mon, Dec 1, 2014 at 9:52 AM, Abraham Elmahrek <[email protected]> wrote:
> > >
> > > > Hey guys,
> > > >
> > > > Any thoughts on where configurable NULL values should be? Either the
> > > > IDF or HDFS connector?
> > > >
> > > > cf: https://issues.apache.org/jira/browse/SQOOP-1678
> > > >
> > > > -Abe
