Re: Configurable NULL in IDF or Connector?

Veena Basavaraj Mon, 01 Dec 2014 11:34:43 -0800

+1 Gwen.

Let's have a separate ticket for #2, since this should be part of the Sqoop
guidelines, especially for the CSV string.

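To make the split between #1 and #2 concrete, here is a rough sketch, not Sqoop code. It assumes the IDF CSV text always carries one fixed, unquoted NULL token (taken to be NULL, as suggested further down the thread) and that strings are single-quoted, so the token can be told apart from the string 'NULL'. The names IDF_NULL and to_connector_line are hypothetical, and the fields are assumed to be already split.

```python
# Assumption: the fixed NULL token used inside the IDF CSV text.
IDF_NULL = "NULL"


def to_connector_line(idf_fields, null_value="NULL"):
    """Re-render one row of already-split IDF CSV fields for output.

    Only the unquoted NULL token is swapped for the connector's configured
    null string; a quoted string such as 'NULL' is left untouched.
    """
    return ",".join(null_value if f == IDF_NULL else f for f in idf_fields)


if __name__ == "__main__":
    row = ["1", "'abc'", "NULL", "'NULL'"]
    # A connector configured to write \N for NULL values.
    print(to_connector_line(row, r"\N"))
```

The point of the sketch is only that the substitution can live in the connector (#1) while the IDF token stays fixed (#2).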

Best,
*./Vee*

On Mon, Dec 1, 2014 at 11:26 AM, Abraham Elmahrek <[email protected]> wrote:

> Indeed. I created SQOOP-1678, which is intended to address #1. Let me
> re-define it...
>
> Also, for #2... There are a few ways of generating output. It seems NULL
> values range from "\N" to 0x0 to "NULL". I think keeping NULL makes sense.
>
> On Mon, Dec 1, 2014 at 10:58 AM, Jarek Jarcec Cecho <[email protected]> wrote:
>
> > I do share the same point of view as Gwen. The CSV format for IDF is very
> > strict so that we have minimal surface area for inconsistencies between
> > multiple connectors. This is because the IDF is an agreed upon exchange
> > format when transferring data from one connector to the other. That
> > however shouldn't stop one connector (such as HDFS) from offering ways to
> > save the resulting CSV differently.
> >
> > We had a similar discussion about separator and quote characters in
> > SQOOP-1522 that seems to be relevant to the NULL discussion here.
> >
> > Jarcec
> >
> > > On Dec 1, 2014, at 10:42 AM, Gwen Shapira <[email protected]> wrote:
> > >
> > > I think it's two different things:
> > >
> > > 1. HDFS connector should give more control over the formatting of the
> > > data in text files (nulls, escaping, etc.)
> > > 2. IDF should give NULLs in a format that is optimized for
> > > MySQL/Postgres direct connectors (since that's one of the IDF design
> > > goals).
> > >
> > > Gwen
> > >
> > > On Mon, Dec 1, 2014 at 9:52 AM, Abraham Elmahrek <[email protected]> wrote:
> > >> Hey guys,
> > >>
> > >> Any thoughts on where configurable NULL values should be? Either the
> > >> IDF or HDFS connector?
> > >>
> > >> cf: https://issues.apache.org/jira/browse/SQOOP-1678
> > >>
> > >> -Abe
> >
> >
>
