Re: Handling incoming file names that contain embedded spaces

James McMahon Thu, 15 Dec 2016 03:08:29 -0800

As a representative example, here is a notional file name with an arbitrary
Unicode character at the front and the back:

[U+0932][U+0932][U+0932]+123 ABC[U+07C1]
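
One possible workaround is to rename the files before GetFile ever sees them,
in the same preprocessing step that unpacks the zip files. The following is
only a minimal Python sketch of that idea, not something NiFi provides. The
helper names (describe, sanitize, rename_all) are hypothetical, and the policy
of replacing every space and non-ASCII character with an underscore is an
assumption that may not suit every downstream consumer.

```python
import os


def describe(name):
    """Show non-ASCII characters as [U+XXXX] so a name can be pasted as plain ASCII."""
    return "".join(
        ch if ord(ch) < 128 else "[U+%04X]" % ord(ch)
        for ch in name
    )


def sanitize(name, replacement="_"):
    """Replace spaces, control characters and non-ASCII characters.

    Assumed policy (not a NiFi requirement): keep printable ASCII other than
    the space, and turn everything else into `replacement`.
    """
    return "".join(
        ch if 32 < ord(ch) < 127 else replacement
        for ch in name
    )


def rename_all(directory, dry_run=True):
    """Rename every entry in `directory` to its sanitized form.

    Returns a list of (old_name, new_name) pairs. With dry_run=True nothing
    is changed on disk, so the plan can be reviewed first.
    """
    taken = set(os.listdir(directory))
    renamed = []
    for old in sorted(taken):
        new = sanitize(old)
        if new == old:
            continue
        if new in taken:
            # Two different names can collapse to the same sanitized name;
            # skip rather than overwrite an existing file.
            continue
        if not dry_run:
            os.rename(os.path.join(directory, old), os.path.join(directory, new))
        taken.add(new)
        renamed.append((old, new))
    return renamed
```

For the notional name above, describe() gives back the same
[U+0932][U+0932][U+0932]+123 ABC[U+07C1] notation, and sanitize() yields
___+123_ABC_. Running rename_all() with dry_run=True first shows what would be
renamed without touching the files.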


On Wed, Dec 14, 2016 at 7:22 PM, James McMahon <[email protected]> wrote:

> Yes indeed Joe, it appears from the logs that there are non-ASCII Unicode
> characters preceding and at the end of the file name. The log shows them as
> odd representations of "unprintables", for example small inverted question
> marks in diamonds. They are embedded in the file names by the application
> that created the files. I copied one and tried to paste and save it into a
> text file, and Notepad directed me to switch to another encoding in order to
> save the file name string. I was able to get it to save by switching to
> Unicode encoding.
>
> I can't send the logs from my system. I can only relay this in this way.
> Would you expect that such character encoding would cause problems for
> GetFile? What alternatives do I have to work around this problem? Thank you
> once again.
>
> On Wed, Dec 14, 2016 at 6:04 PM, Joe Witt <[email protected]> wrote:
>
>> James,
>>
>> I suspect there is more to the issue than the spaces.  GetFile itself
>> should be fine there.  Can you share logs showing what is happening
>> with these files?  Can you share some sample filenames that it is
>> struggling with?  You can also enable debug logging for that processor
>> which could provide some interesting details as well.
>>
>> Thanks
>> Joe
>>
>> On Wed, Dec 14, 2016 at 5:03 PM, James McMahon <[email protected]>
>> wrote:
>> > I am using NiFi 0.6.1. I am trying to use GetFile to read in a large
>> > series of files I have preprocessed outside of NiFi from zip files using
>> > bash shell commands. GetFile is throwing errors on many of these files
>> > because the files contain embedded spaces. Is there a way to tell NiFi to
>> > handle each such filename with surrounding single quotes? Are there other
>> > processor options better suited to handle this challenge? Thank you.
>>
>
>
