As a representative example, here is a notional file name with arbitrary Unicode characters at the front and the back: [U+0932][U+0932][U+0932]+123 ABC[U+07C1]
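
To make the example concrete, here is a minimal Python sketch that builds that notional name and lists the non-ASCII code points in it. It is not NiFi code and it is not taken from the logs. The helper names `non_ascii_report` and `ascii_only` are hypothetical, and stripping the characters before GetFile sees the files is only an assumed workaround, not something confirmed to fix the errors.

```python
import unicodedata

# Notional file name from the example above:
# three U+0932 characters, then "+123 ABC", then one U+07C1 character.
name = "\u0932\u0932\u0932+123 ABC\u07c1"


def non_ascii_report(filename):
    """Return (index, 'U+XXXX', Unicode name) for each non-ASCII character."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for i, ch in enumerate(filename)
        if ord(ch) > 127
    ]


def ascii_only(filename):
    """Hypothetical workaround: drop non-ASCII characters, trim outer whitespace."""
    return "".join(ch for ch in filename if ord(ch) < 128).strip()


if __name__ == "__main__":
    for entry in non_ascii_report(name):
        print(entry)
    print(repr(ascii_only(name)))
```

Running the report over a directory listing before pointing GetFile at it would show which names carry these characters. Whether to rename the files with something like `ascii_only` depends on whether the original names need to be preserved.
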

On Wed, Dec 14, 2016 at 7:22 PM, James McMahon <[email protected]> wrote:
> Yes indeed Joe, it appears from the logs that there are non-ASCII unicode
> characters preceding and at end of the file name. The log shows them as odd
> representations of "unprintables" - for example, small inverted question
> marks in diamonds, etc etc. They are embedded in the file names by the
> application that created the files. I copied and tried to paste and save
> into a text file, and notepad directed me to switch to another encoding in
> order to save the file name string. I was able to get it to save by
> switching to Unicode encoding.
>
> I can't send the logs from my system. I can only relay this in this way.
> Would you expect that such character encoding would cause problems for
> GetFile? What alternatives do I have to work around this problem? Thank you
> once again.
>
> On Wed, Dec 14, 2016 at 6:04 PM, Joe Witt <[email protected]> wrote:
>
>> James,
>>
>> I suspect there is more to the issue than the spaces. GetFile itself
>> should be fine there. Can you share logs showing what is happening
>> with these files? Can you share some sample filenames that it is
>> struggling with? You can also enable debug logging for that processor
>> which could provide some interesting details as well.
>>
>> Thanks
>> Joe
>>
>> On Wed, Dec 14, 2016 at 5:03 PM, James McMahon <[email protected]>
>> wrote:
>> > I am using NiFi 0.6.1. I am trying to use GetFile to read in a large series
>> > of files I have preprocessed outside of NiFi from zip files using bash shell
>> > commands. GetFile is throwing errors on many of these files because the
>> > files contain embedded spaces. Is there a way to tell NiFi to handle each
>> > such filename with surrounding single quotes? Are there other processor
>> > options better suited to handle this challenge? Thank you.
>> >
