If your use case is just copying files on HDFS, and you do not need to
look inside the files (parsing records, processing), then you do not need
AbstractFileInputOperator.

Instead, you can use FSInputModule and HDFSFileCopyModule, as done in this
application:
https://github.com/apache/apex-malhar/tree/master/apps/filecopy

Here, files are read as raw binary data and written back unchanged, so
character encoding should not matter.
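
To illustrate why encoding does not matter for a raw copy, here is a minimal
sketch using only the JDK. It is not the filecopy application's code; the
class name RawCopyDemo and the method copyRaw are hypothetical names chosen
for this example, and in-memory streams stand in for the HDFS streams.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; not part of Apex Malhar.
public class RawCopyDemo {

    // Copies bytes from in to out without ever decoding them to characters.
    public static void copyRaw(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[4096];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
    }

    public static void main(String[] args) throws IOException {
        // French text with accented characters, encoded as UTF-8 bytes.
        // Unicode escapes keep this source file ASCII-only.
        String text = "\u00c9l\u00e8ve fran\u00e7ais";
        byte[] original = text.getBytes(StandardCharsets.UTF_8);

        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        copyRaw(new ByteArrayInputStream(original), sink);
        byte[] copied = sink.toByteArray();

        // The copy is byte-for-byte identical to the input.
        System.out.println(Arrays.equals(original, copied));

        // Decoding is only needed if you want to inspect the text.
        String decoded = new String(copied, StandardCharsets.UTF_8);
        System.out.println(text.equals(decoded));
    }
}
```

Because the bytes are never converted to characters, the accented characters
cannot be corrupted on the way through. A charset only becomes relevant if
you later decode the bytes to parse records.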

https://www.brighttalk.com/webcast/13685/194937/hadoop-ingestion-made-easy
gives some explanation of this.

Let me know if this filecopy application suits your use case.

~ Yogi

On 9 August 2016 at 20:59, Mukkamula, Suryavamshivardhan (CWM-NR) <
[email protected]> wrote:

> Hi,
>
> I have files on HDFS with French characters that I need to write to
> another file on HDFS. I am using AbstractFileInputOperator.java, which has
> the following method that can stream the input file. Can you please suggest
> how I should handle the French characters? (I suppose I should pass the
> UTF-8 character encoding when creating the input stream, but I am not sure
> how to achieve that.)
>
> ############### method from AbstractFileInputOperator.java ###############
>
>   protected InputStream openFile(Path path) throws IOException
>   {
>     currentFile = path.toString();
>     offset = 0;
>     retryCount = 0;
>     skipCount = 0;
>     LOG.info("opening file {}", path);
>     InputStream input = fs.open(path);
>     return input;
>   }
>
> Regards,
> Surya Vamshi
