Hello Ram/Team,

My requirement is to read input feeds from different locations on HDFS and to parse those files according to XML configuration files. Each input feed has its own configuration file, which defines the fields inside that feed.
My approach: I would like to define a mapping file that contains, for each feed, the feed identifier, the feed location, and the configuration file location. I would read this mapping file once at initial load, inside the setup() method, and use it to define my DirectoryScan.acceptFiles.

My challenge is this: when I read the files, I have to parse each line using the configuration file of the feed it belongs to. How do I know which file a given line came from? If I know that, I can read the corresponding configuration file before parsing the line.

Please let me know how I can handle this.

Regards,
Surya Vamshi

From: Munagala Ramanath [mailto:[email protected]]
Sent: 2016, May, 24 5:49 PM
To: Mukkamula, Suryavamshivardhan (CWM-NR)
Subject: Multiple directories

One way of addressing the issue is to use an external tool (such as a script) to copy all the input files to a common directory before the Apex application starts, making sure that the file names are unique so that one file does not overwrite another. The Apex application then starts and processes the files from this directory.

If you set the partition count of the file input operator to N, it will create N partitions, and the files will be distributed among the partitions automatically. The partitions will work in parallel.

Ram
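For illustration only, the mapping-file idea from the question could be sketched in plain Java as below. This is not Apex or Malhar API: the class name FeedMapping, the comma-separated line format (feedId,feedDir,configPath), and the longest-prefix lookup are all assumptions made for the example. It also assumes the operator can supply the path of the file currently being read, which is the part the question asks about and which this sketch does not solve.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

/** Sketch: maps an input file path to the feed (and XML config) it belongs to. */
final class FeedMapping {

    /** One row of the hypothetical mapping file: feedId,feedDir,configPath */
    static final class Entry {
        final String feedId;
        final String feedDir;     // always stored with a trailing '/'
        final String configPath;

        Entry(String feedId, String feedDir, String configPath) {
            this.feedId = feedId;
            // Trailing slash keeps /data/feedA from matching /data/feedAB/...
            this.feedDir = feedDir.endsWith("/") ? feedDir : feedDir + "/";
            this.configPath = configPath;
        }
    }

    private final List<Entry> entries = new ArrayList<>();

    /** Parses mapping lines; blank lines and lines starting with '#' are skipped. */
    static FeedMapping parse(List<String> lines) {
        FeedMapping mapping = new FeedMapping();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue;
            }
            String[] parts = line.split(",");
            if (parts.length != 3) {
                throw new IllegalArgumentException("Bad mapping line: " + raw);
            }
            mapping.entries.add(
                new Entry(parts[0].trim(), parts[1].trim(), parts[2].trim()));
        }
        return mapping;
    }

    /** Returns the entry whose feed directory is the longest prefix of filePath, if any. */
    Optional<Entry> lookup(String filePath) {
        Entry best = null;
        for (Entry e : entries) {
            if (filePath.startsWith(e.feedDir)
                    && (best == null || e.feedDir.length() > best.feedDir.length())) {
                best = e;
            }
        }
        return Optional.ofNullable(best);
    }
}
```

Under these assumptions, the mapping would be parsed once (for example in setup()), and lookup(path) would be called whenever a new file is opened, so that the matching configuration is loaded once per file rather than once per line. Note that this relies on the files staying in their per-feed directories; if they are first copied into one common directory as suggested above, the feed identifier would have to be carried in the file name instead.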
