[
https://issues.apache.org/jira/browse/HADOOP-5307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12681732#action_12681732
]
Enis Soztutar commented on HADOOP-5307:
---------------------------------------
bq. These read as mutually exclusive statements. If escaping "__null__" is
backwards-incompatible, then escaping null is, too. bq. That any escape
sequence needs a way to represent itself literally isn't controversial.
Further, the change is to a public utility class: even if we defined
"incompatible" statistically, the frequency of either case is unknown since its
domain is undefined.
Of course the patch is technically backwards-incompatible, what I say is that,
the frequency is expected to be so small that it is negligible.
bq. I'm a little confused about why StringUtils needs its own escape for a null
reference. Converting an array of String with a null reference to a String with
a special symbol for null is odd. It's cleaner to throw from StringUtils if it
gets null and require callers to handle escaping.
passing an array, possibly containing null values seems to me generic enough to
be introduced in StringUtils rather than a custom solution in the
context(DBOutputFormat).
bq. HADOOP-4955 probably should not have been committed to 0.19.1. It reads as
an improvement, not a fix to a regression.
The API indicates to use the supplied column names, when in fact it did not, so
4955 is a bug fix, fixing the expected behavior from the API, rather than an
improvement, introducing a new API.
bq. What was committed didn't match the patch posted here. There was a conflict
in the imports for StringUtils.
The patch is committed w/o changes to import statements. Committing a patch
with minor changes is not uncommon. However I should have stated that.
bq. One of the cases added in testArrayToStringToArray (nullArr3) doesn't have
a corresponding test.
Will fix
At this point I think we can continue with the patch in three ways :
1. Introduce new String[] -> String, and String -> String[] array methods
handling possible null values.
2. Mark the patch incompatible and commit this one
3. introduce a new API to set only the number of columns(not the names) in the
DBConfiguration. However this does not fix the passing null values in
configuration problem, if there is such.
What would you suggest?
> Fix null value handling in StringUtils#arrayToString() and #getStrings()
> ------------------------------------------------------------------------
>
> Key: HADOOP-5307
> URL: https://issues.apache.org/jira/browse/HADOOP-5307
> Project: Hadoop Core
> Issue Type: Bug
> Components: util
> Affects Versions: 0.21.0
> Reporter: Enis Soztutar
> Assignee: Enis Soztutar
> Attachments: h5307_v1.patch
>
>
> StringUtils#arrayToString() converts String array to a String of comma
> separated elements. If the String array includes null values, these are
> recovered as "null" (literal) from getStrings() method, which eventually
> causes configuration issues.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.