[ 
https://issues.apache.org/jira/browse/HADOOP-4212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12633665#action_12633665
 ] 

Steve Loughran commented on HADOOP-4212:
----------------------------------------

-You can look for the xml:space attribute on any element and act on it; when 
working with XSD-schema'd docs I think Xerces behaves differently when it hits 
that attribute, but I forget the details.
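
As a rough illustration of acting on xml:space, here is a minimal sketch using 
the JDK's DOM API. The class and method names (XmlSpaceSketch, preservesSpace, 
readRootValue) are hypothetical and are not part of Hadoop's Configuration 
class; it also assumes a parser that is not namespace-aware, so the attribute 
can be looked up by its qualified name.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical helper, not Hadoop code: trims a value unless xml:space="preserve".
final class XmlSpaceSketch {

    // xml:space applies to an element and its descendants, so walk up
    // to the nearest element that declares it.
    static boolean preservesSpace(Element element) {
        for (Node n = element; n instanceof Element; n = n.getParentNode()) {
            String declared = ((Element) n).getAttribute("xml:space");
            if ("preserve".equals(declared)) {
                return true;
            }
            if ("default".equals(declared)) {
                return false;
            }
        }
        return false; // nothing declared: treat as "default" and trim
    }

    // Returns the element text, trimmed unless whitespace must be preserved.
    static String readValue(Element element) {
        String text = element.getTextContent();
        return preservesSpace(element) ? text : text.trim();
    }

    // Convenience for experiments: parse a small document and read its root value.
    static String readRootValue(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return readValue(doc.getDocumentElement());
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }
}
```

With this sketch, a value element is trimmed by default and left untouched only 
when it, or an enclosing element, carries xml:space="preserve".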

-Yes, it would cause Windows to behave differently and not allow filenames with 
trailing spaces, or other strings. But I don't see that filenames with trailing 
spaces and carriage returns actually make sense, even on Windows. Spaces 
mid-path, maybe, but leading or trailing? Danger.

FWIW, I'm not using the XML format for our configurations; we use our own 
configuration format:

http://smartfrog.svn.sourceforge.net/viewvc/smartfrog/trunk/core/components/hadoop/src/org/smartfrog/services/hadoop/components/hadoopconfiguration.sf?view=markup

Looking at the current declarations, there's nowhere where white space is 
useful, and there are places (in comma separated lists), where it may already 
be harmful and need filtering. There may be some inconsistency between 
filenames (HADOOP-2366) and user group information, where spaces between words 
are allowed in hadoop.job.ugi. I would propose

-consistent filtering of spaces wherever lists are taken (strip leading and 
trailing whitespace from each entry), 
-trimming of leading and trailing whitespace from values

What may make sense is to allow quoted whitespace, so you could have a list of 
directories, those in quotes would be passed down as is:

<name>dfs.data.dir</name>
<value>/mnt/hstore2/hdfs , "/home/user2/temp hadoop dir"</value> 

This would resolve to a list with two entries 
["/mnt/hstore2/hdfs","/home/user2/temp hadoop dir"]
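
A minimal sketch of how such a list could be parsed, to make the proposal 
concrete. The class QuotedListSketch and its parse method are hypothetical and 
not part of Hadoop; the sketch assumes double quotes only, no escape sequences, 
and that empty unquoted entries are dropped.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not Hadoop code: splits on commas, trims whitespace
// outside quotes, and passes quoted text through unchanged.
final class QuotedListSketch {

    static List<String> parse(String raw) {
        List<String> entries = new ArrayList<>();
        StringBuilder buf = new StringBuilder();
        boolean inQuotes = false;
        int quotedStart = -1; // index in buf where quoted text begins, -1 if none
        int quotedEnd = -1;   // index in buf just past the last quoted character

        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (c == '"') {
                if (!inQuotes && quotedStart < 0) {
                    quotedStart = buf.length();
                } else if (inQuotes) {
                    quotedEnd = buf.length();
                }
                inQuotes = !inQuotes;
            } else if (c == ',' && !inQuotes) {
                flush(entries, buf, quotedStart, quotedEnd);
                buf.setLength(0);
                quotedStart = -1;
                quotedEnd = -1;
            } else {
                buf.append(c);
            }
        }
        if (inQuotes) {
            quotedEnd = buf.length(); // unterminated quote: treat the rest as quoted
        }
        flush(entries, buf, quotedStart, quotedEnd);
        return entries;
    }

    // Trims whitespace at both ends, but never cuts into the quoted region.
    private static void flush(List<String> out, StringBuilder buf,
                              int quotedStart, int quotedEnd) {
        boolean quoted = quotedStart >= 0;
        int start = 0;
        int end = buf.length();
        int startLimit = quoted ? quotedStart : end;
        while (start < startLimit && Character.isWhitespace(buf.charAt(start))) {
            start++;
        }
        int endLimit = quoted ? quotedEnd : start;
        while (end > endLimit && Character.isWhitespace(buf.charAt(end - 1))) {
            end--;
        }
        if (quoted || end > start) {
            out.add(buf.substring(start, end));
        }
    }

    public static void main(String[] args) {
        String value = "/mnt/hstore2/hdfs , \"/home/user2/temp hadoop dir\"";
        for (String entry : parse(value)) {
            System.out.println("[" + entry + "]");
        }
    }
}
```

Spaces in the middle of an unquoted entry are kept, so only leading and 
trailing whitespace outside quotes is filtered.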





> New lines and leading spaces are not trimmed of a value when configuration is 
> read
> ----------------------------------------------------------------------------------
>
>                 Key: HADOOP-4212
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4212
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: conf
>    Affects Versions: 0.18.1
>         Environment: Generic
>            Reporter: Sreekanth Ramakrishnan
>            Assignee: Sreekanth Ramakrishnan
>            Priority: Minor
>         Attachments: HADOOP-4212-1.patch, HADOOP-4212-TESTCASE.patch
>
>
> While configuration value is read the leading and trailing spaces and new 
> line characters are taken into account.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
