Peter Hannam created SQOOP-1495:
-----------------------------------
Summary: EnclosedBy and EscapedBy set to \000 are not ignored
Key: SQOOP-1495
URL: https://issues.apache.org/jira/browse/SQOOP-1495
Project: Sqoop
Issue Type: Bug
Affects Versions: 1.4.5
Reporter: Peter Hannam
Priority: Minor
In {{DelimiterSet}} there is the following comment above two option variables:
{code:java}
// If these next two fields are '\000', then they are ignored.
private char enclosedBy;
private char escapedBy;
{code}
We hit a problem with this during a Sqoop export. Looking at the code in
{{RecordParser}}, it appears that although the comment says these fields are
ignored when set to \000, they are not.
For some reason some of the records we're trying to export have \000 in a
column. This is fine as long as the \000 isn't just before the delimiter.
This is fine: {{foo\000bar|moo}} (two columns are exported).
This isn't fine: {{foo\000|bar}} (only one column is exported).
Looking through {{RecordParser}}, the problem is that our \000 character is
being treated as an enclosing character, so the parser then assumes the
delimiter is part of a value. {{enclosedBy}} is left at its default of \000,
which should mean it is ignored, but when \000 turns up in the data it is
still picked up.
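To illustrate the failure mode, here is a minimal, self-contained sketch. It is not the {{RecordParser}} code: the class {{NulSentinelSketch}}, the {{split}} method and the {{honorSentinel}} flag are all made up for this example, and the state handling is much simpler than the real parser's. It only shows how comparing input characters directly against a \000 sentinel lets a \000 in the data swallow the following delimiter, and how an explicit "is this option set?" guard avoids that. In this simplified model it is the escape check that matches; the same guard would apply to the enclosing check.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified splitter for illustration; NOT Sqoop's RecordParser. */
public class NulSentinelSketch {
    /** Sentinel meaning "option not set", as described by the DelimiterSet comment. */
    static final char UNSET = '\000';

    /**
     * Splits one record into fields.
     *
     * @param honorSentinel when true, an enclosedBy/escapedBy of '\000' is treated
     *        as disabled; when false it is compared against the input like any
     *        other character (assumed to mirror the reported behaviour).
     */
    static List<String> split(String record, char fieldDelim,
                              char enclosedBy, char escapedBy,
                              boolean honorSentinel) {
        boolean useEnclose = !(honorSentinel && enclosedBy == UNSET);
        boolean useEscape = !(honorSentinel && escapedBy == UNSET);

        List<String> fields = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        boolean atFieldStart = true;
        boolean enclosed = false;
        boolean escaped = false;

        for (int i = 0; i < record.length(); i++) {
            char c = record.charAt(i);
            if (escaped) {
                // Previous char was the escape char: take this one literally.
                cur.append(c);
                escaped = false;
            } else if (useEscape && c == escapedBy) {
                escaped = true;
            } else if (useEnclose && c == enclosedBy && (atFieldStart || enclosed)) {
                // Opening encloser at field start, or closing encloser.
                enclosed = !enclosed;
            } else if (c == fieldDelim && !enclosed) {
                fields.add(cur.toString());
                cur.setLength(0);
                atFieldStart = true;
                continue;
            } else {
                cur.append(c);
            }
            atFieldStart = false;
        }
        fields.add(cur.toString());
        return fields;
    }

    public static void main(String[] args) {
        // Sentinel not honoured: \000 in the data is matched as a special char.
        System.out.println(split("foo\000bar|moo", '|', UNSET, UNSET, false).size()); // 2
        System.out.println(split("foo\000|bar", '|', UNSET, UNSET, false).size());    // 1
        // Sentinel honoured: \000 is ordinary data and the delimiter is seen.
        System.out.println(split("foo\000|bar", '|', UNSET, UNSET, true).size());     // 2
    }
}
```

With the guard off, the sketch reproduces the two cases above (two columns, then one). With the guard on, both records split into two columns and the \000 is kept as ordinary data.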