Any specific reason for using Storm for this use case and not Sqoop?

From: Aiman Najjar [mailto:[email protected]]
Sent: 03 November 2014 15:25
To: [email protected]
Subject: Storm Trident use case
Hello,

I'm new to Storm. I'm trying to build a Trident topology that exports rows from an Oracle DB to HDFS. I'm using an existing implementation of HdfsState.

I wrote my own Trident spout that emits tuples in 1-hour time slices (so the coordinator metadata are timestamps). For instance, if the time now is 5:30pm and the current position of the spout coordinator is 12pm, the spout will emit 5 batches of tuples.

My problem is that I'm not sure whether I'm taking advantage of Storm's distributed processing: will those 5 batches be processed in parallel? Is my design optimal?

Here's what my topology looks like:

    topology.newStream(tableName + "_AUDIT_STREAM", auditSpout)
            .partitionPersist(factory, tableFields, new HdfsUpdater(), new Fields());

Thanks
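[Editor's note: to make the parallelism question concrete — in Trident, parallelism applies to tuples *within* a batch (spread across spout partitions and downstream tasks), while batches headed into partitionPersist are committed in order; the number of batches in flight is bounded by max spout pending. A minimal sketch extending the snippet above, assuming a partitioned spout; the parallelism and pending values are illustrative, not recommendations:]

    // Sketch only: parallelismHint spreads each batch's tuples across
    // tasks; it does not reorder batches, which partitionPersist
    // still commits sequentially.
    topology.newStream(tableName + "_AUDIT_STREAM", auditSpout)
            .parallelismHint(4)  // hypothetical: 4-way parallelism within a batch
            .partitionPersist(factory, tableFields, new HdfsUpdater(), new Fields());

    Config conf = new Config();
    // hypothetical: allow up to 3 batches pipelined at once
    conf.setMaxSpoutPending(3);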
